Jul 09

Trunk Now Open for 2.1 Changes

This post is probably only of interest to those playing with or working on the code for Programmer’s Notepad.

Programmer’s Notepad 2.0.10 RC is now out, and with that the code has been branched in subversion to the rel-2-0-10 branch.

This means that the trunk is open for big changes again, and there are some relatively big changes on the way – in fact for the next few weeks expect the trunk to be broken fairly often. Here are the changes coming in:

  1. Fixing the Unicode build – 2.1 will be released as a full Unicode rather than mixed mode build
  2. Updating to the newest Scintilla code, error handling model has changed completely
  3. Moving to Visual Studio 2008 SP1 for development instead of 2005
  4. Updating Boost to latest release

Apart from this (!) I don’t intend to take many changes before 2.1 is released. You can see the current suggested list of items to fix in the tracker: Ellington Issues.

Feb 09

Creating your first Programmer’s Notepad extension

PyPN provides a great way to add functionality to Programmer’s Notepad by writing simple Python code, but you might want to do something more advanced. For this there’s the Programmer’s Notepad Extension SDK.

The SDK lets you extend PN using C++, allowing you to react to editor events and provide new commands in the menu. PyPN is itself implemented as an extension using this same SDK, and you can use the SDK to provide support for other scripting languages too.

What You’ll Need

You need a Windows C++ compiler and the Boost C++ library. Note that you don’t need to compile any of Boost; we only use the header-only parts.

I suggest using the free Microsoft Visual C++ Express if you don’t already have Visual Studio; this should guarantee compatibility.

Getting Started

Download the SDK and copy the template project, which is a good base for your extension. Note that the SDK also contains a demo extension showing use of various parts of the SDK. Change the name and version of your extension and you’re ready to add it to Programmer’s Notepad for the first time:

void __declspec(dllexport) __stdcall pn_get_extension_info(PN::BaseString& name, PN::BaseString& version)
{
    name = "My First Plugin";
    version = "1.0";
}

Compile the extension and place the .dll file in your PN directory. Now run “pn --findexts” and your plugin will be discovered and loaded the next time you start PN. Go to Tools->Options->Extensions and see your extension listed.

Everything else you want to do flows from the instance of IPN that’s passed to your init function. This interface gives you access to the open documents, lets you sign up to handle document events and gives you access to app-level services like script registration, find in files and options management.

Working with Documents

Everything you want to do with an open document is done through the IDocument interface. You get a pointer to one of these from your IPN instance by calling GetCurrentDocument, NewDocument or equivalent.

    // Make a new document
    IDocumentPtr doc = pn->NewDocument(NULL);
    // Send scintilla messages (see documentation)
    doc->SendEditorMessage(SCI_APPENDTEXT, 6, (LPARAM)"Hello!");

    // Save changes
    doc->Save("c:\\temp\\test.txt", true);

    // Done with the document

This is the first in a series of posts that will become the introductory documentation for extensions. Next time we’ll look at how to add menu commands for your plugin. The series will be added to the docs site as we go: Writing your First Extension

Apr 08

Boost::Xpressive and Scintilla

Programmer’s Notepad has long needed an improved Regular Expressions engine. Currently PN uses PCRE for all tasks but searching Scintilla. This is because PCRE doesn’t support searching anything but a memory buffer – i.e. it doesn’t support iterators. We need iterator (or indirect access) support because a regex engine for a text editor can’t expect all text for the editor to be in a single contiguous memory block.

Boost::Regex has been suggested several times, but it still doesn’t support named captures. When allowing users to specify regular expressions for use in parsing, named captures can significantly simplify the process. For example, when using a regular expression to parse compiler output we have two alternatives:

  1. \s*(?P<f>.+)(?P<l>[0-9]+)(,(?P<c>[0-9]+))?\s*:

    This uses the standard named capture syntax to name the three capture blocks: “f” for filename, “l” for line and “c” for column. The single expression can be parsed and understood by PN without the user having to understand capture indexing.

  2. \s*(.+)([0-9]+)(,([0-9]+))?\s*:

    This uses basic regular expression capture groups and results in the user having to enter three additional pieces of non-obvious data: the capture index for each capture. In this case these would be 1, 2, and 4 but this would potentially change for each output pattern.


I believe that using named captures significantly improves the user experience around this, especially considering that PN uses %f, %l and %c to represent the three named capture groups meaning that users don’t even need to understand regular expression capture syntax to use them.
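As a rough illustration of why the placeholders help, here is a minimal sketch of how %f, %l and %c could be expanded into named capture groups before the pattern reaches the regex engine. The function name expandPlaceholders and the group bodies are assumptions made for this example; this is not PN's actual implementation.

```cpp
#include <string>

// Hypothetical helper: expand PN-style %f, %l and %c placeholders into
// named capture groups. The group bodies below are illustrative guesses.
std::string expandPlaceholders(const std::string& pattern)
{
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i)
    {
        if (pattern[i] == '%' && i + 1 < pattern.size())
        {
            switch (pattern[i + 1])
            {
            case 'f': out += "(?P<f>.+)";     ++i; continue; // filename
            case 'l': out += "(?P<l>[0-9]+)"; ++i; continue; // line
            case 'c': out += "(?P<c>[0-9]+)"; ++i; continue; // column
            default: break; // not a placeholder, copy the '%' as-is
            }
        }
        out += pattern[i];
    }
    return out;
}
```

With this in place the user types something like %f:%l and never has to see the capture syntax or count capture indexes.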

Boost 1.35 introduces version 2 of Boost.Xpressive, the other Boost regular expressions engine. Boost.Xpressive naturally supports iterators, and version 2 adds support for named captures.

Implementing a Scintilla Iterator

Xpressive requires a bi-directional iterator class (one that can move forwards and backwards over the contents). I’ve currently implemented a very simple, naive iterator to prove that this can work:

/**
 * std::iterator compatible iterator for Scintilla contents
 */
class ScintillaIterator : public std::iterator<std::bidirectional_iterator_tag, char>
{
public:
    ScintillaIterator() :
        m_scintilla(0), m_pos(0), m_end(0)
    {
    }

    ScintillaIterator(CScintilla* scintilla, int pos) :
        m_scintilla(scintilla), m_pos(pos), m_end(scintilla->GetLength())
    {
    }

    ScintillaIterator(const ScintillaIterator& copy) :
        m_scintilla(copy.m_scintilla), m_pos(copy.m_pos), m_end(copy.m_end)
    {
    }

    bool operator == (const ScintillaIterator& other) const
    {
        return (ended() == other.ended()) && (m_scintilla == other.m_scintilla) && (m_pos == other.m_pos);
    }

    bool operator != (const ScintillaIterator& other) const
    {
        return !(*this == other);
    }

    char operator * () const
    {
        return charAt(m_pos);
    }

    ScintillaIterator& operator ++ ()
    {
        m_pos++;
        return *this;
    }

    ScintillaIterator& operator -- ()
    {
        m_pos--;
        return *this;
    }

    int pos() const
    {
        return m_pos;
    }

private:
    char charAt(int position) const
    {
        return m_scintilla->GetCharAt(position);
    }

    bool ended() const
    {
        return m_pos == m_end;
    }

    int m_pos;
    int m_end;
    CScintilla* m_scintilla;
};


This can then be used with Xpressive like this:

typedef boost::xpressive::basic_regex<ScintillaIterator> sciregex;
typedef boost::xpressive::match_results<ScintillaIterator> scimatch;
typedef boost::xpressive::sub_match<ScintillaIterator> scisub_match;

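void test()
{
    sciregex regex = sciregex::compile("[0-9]+");
    scimatch match;
    ScintillaIterator begin(m_scintilla, 0), end(m_scintilla, m_scintilla->GetLength());
    if (regex_match(begin, end, match, regex)) { /* use match here */ }
}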

This code is now available in Programmer’s Notepad subversion, and it seems to work. The iterator needs a bit of improvement to buffer data from Scintilla, or perhaps needs moving so that it doesn’t have to send a windows message for every character access. However, as a proof of concept it’s a good one and it suggests that we should be able to replace the current lacklustre regex searching support with fully featured multi-line support for the next release.
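To give an idea of what the buffering improvement might look like, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration: MockEditor stands in for the Scintilla control (it only counts how often text is requested), BufferedReader and BufferedIterator are hypothetical names, the 64-character window is arbitrary, and std::regex is swapped in for Xpressive purely so the example builds without Boost (std::regex does not support named captures).

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Stand-in for the editor control: hands out text in ranges and counts the
// number of requests. Hypothetical, not the PN or Scintilla API.
class MockEditor
{
public:
    explicit MockEditor(const std::string& text) : m_text(text), m_requests(0) {}
    int length() const { return static_cast<int>(m_text.size()); }
    std::string textRange(int start, int count)
    {
        ++m_requests;
        return m_text.substr(start, count);
    }
    int requests() const { return m_requests; }
private:
    std::string m_text;
    int m_requests;
};

// Caches one window of text so most character reads avoid the editor.
class BufferedReader
{
public:
    explicit BufferedReader(MockEditor& editor) : m_editor(editor), m_start(0) {}
    int length() const { return m_editor.length(); }
    char charAt(int pos)
    {
        if (pos < m_start || pos >= m_start + static_cast<int>(m_buf.size()))
        {
            // Cache miss: fetch the aligned window containing pos.
            m_start = (pos / Window) * Window;
            m_buf = m_editor.textRange(m_start, Window);
        }
        return m_buf[pos - m_start];
    }
private:
    enum { Window = 64 };
    MockEditor& m_editor;
    int m_start;
    std::string m_buf;
};

// Bidirectional iterator that reads through the shared BufferedReader,
// so copying an iterator stays cheap.
class BufferedIterator
{
public:
    typedef std::bidirectional_iterator_tag iterator_category;
    typedef char value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const char* pointer;
    typedef char reference;

    BufferedIterator() : m_reader(0), m_pos(0) {}
    BufferedIterator(BufferedReader* reader, int pos) : m_reader(reader), m_pos(pos) {}

    char operator*() const { return m_reader->charAt(m_pos); }
    BufferedIterator& operator++() { ++m_pos; return *this; }
    BufferedIterator operator++(int) { BufferedIterator old(*this); ++m_pos; return old; }
    BufferedIterator& operator--() { --m_pos; return *this; }
    BufferedIterator operator--(int) { BufferedIterator old(*this); --m_pos; return old; }
    bool operator==(const BufferedIterator& o) const { return m_reader == o.m_reader && m_pos == o.m_pos; }
    bool operator!=(const BufferedIterator& o) const { return !(*this == o); }
private:
    BufferedReader* m_reader;
    int m_pos;
};

// Returns the first run of digits in the editor text, or "" if there is none.
std::string firstNumber(BufferedReader& reader)
{
    std::regex digits("[0-9]+");
    std::match_results<BufferedIterator> match;
    BufferedIterator begin(&reader, 0), end(&reader, reader.length());
    if (std::regex_search(begin, end, match, digits))
        return match.str();
    return std::string();
}
```

Keeping the cache in a shared reader rather than in each iterator matters because regex engines copy iterators constantly; a single window is the simplest possible policy and would thrash on patterns that jump back and forth across a window boundary.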

Jul 07

Regular Expressions Enhancements

Programmer’s Notepad currently uses two different regex engines for different parts of code:

  1. The excellent PCRE: Used by the PN code for matching output strings and in a couple of other internal bits of code.
  2. The tiny engine built into Scintilla. This is a very limited regular expressions engine, designed for embedded scintilla use rather than use in a full powered text editor. It’s currently used for all user regex searches.

The original plan was to switch to using PCRE as the engine for searching in the editor as well. However, PCRE has a rather unfortunate design issue – it expects its search string to be a char* in-memory buffer. Scintilla doesn’t provide access to the text you are editing as a single memory buffer (quite rightly) and so this means there is a fundamental incompatibility between Scintilla and PCRE. I could of course simply retrieve the entire document into an extra memory buffer and run PCRE on that but it’s a very wasteful solution and not one that I’m willing to entertain.

Other libraries work in a much nicer way, using iterators. This allows you to define a custom iterator to walk over Scintilla’s data store thus neatly avoiding the need to provide a full buffer to the regex engine.

Other libraries to consider:

  1. Boost::Regex: PN already has a boost dependency so I don’t have a big issue with adding regex. There are two current issues:

    a. Boost::Regex supports Unicode expressions by using ICU. Bundling ICU will add at least 1 MB of code (I’m still building it to find a total) to the distribution. This is a lot compared to the rest of PN!

    b. Currently Boost::Regex does not support named groups. This is an important regex feature that PN makes use of to support arbitrary output matching.

  2. GRETA: A regular expressions library from Microsoft that has similar features to that of Boost. This also doesn’t support named groups, and doesn’t seem to have UTF-8 Unicode support either, relying on wchar_t which is no use to PN.

Others I’ve discarded due to lack of iterator support include the one built into ICU, oniguruma, and GNU regex.

Currently I’m not 100% decided which way to go. There is a Google SOC project to add named groups to Boost::Regex which would at least remove that block, leaving only the expensive Unicode support. Alternatively I could try to retrofit iterators to PCRE – something that sounds like a lot of hard work!

One way or another, PN will transition to full regex support in the editor.

p.s. In the comments Sebastian points out the highly useful Wikipedia article comparing regular expression engines. This would have saved me a bunch of time if I’d found it earlier!

Mar 07


MSDN Magazine points to an interesting lightweight XML reader/writer framework for non-managed use called XmlLite.

It uses a COM interface (via IUnknown etc.) without the overhead of actually instantiating via COM (COM Lite). XmlLite is basically a port of XmlReader and XmlWriter from the .NET framework to native C++ – providing a significant speed boost over MSXML.

It’s available by default on Vista, or via download for XP and 2k3.

This post brought to you by the “clearing out my tabs” department…

Jan 07

XP “Visual Styles” Manifests in Visual Studio 2005

I’ve been doing some work in Visual Studio 2005 and it no longer supports embedding your XML manifest file through resources. Instead you have to specify some additional manifest dependencies – because of course that’s easier?!

I can never remember how to do this and rather than duplicate the information here, I’ll link to Jim Mathies’ helpful post: Manifest Dependencies in Visual Studio 2005

Aug 06

Visual C++ 2005 stringstream leak

I spent about five or six hours on Friday last week chasing down a memory leak that had appeared in some software at work that had previously been trouble free. I eventually tracked the leak down to a library used in multiple products that didn’t seem to leak elsewhere. Every time I used the library the application leaked 4k.

After a lot of building and testing, I tracked down the leak to a simple instantiation of the stringstream class from the standard library. This class leaks 4k every time it is instantiated due to, I believe, multiply inherited base classes both allocating the same buffer. This may also affect fstream as well.

The fix (for many use cases) is to replace stringstream with ostringstream which doesn’t leak.
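For output-only formatting the swap is mechanical. A minimal sketch follows; the function describeLeak is a made-up example and not code from the library in question:

```cpp
#include <sstream>
#include <string>

// Only output is needed here, so ostringstream can stand in for stringstream.
std::string describeLeak(int bytes, int count)
{
    std::ostringstream out; // was: std::stringstream out;
    out << count << " allocations leaked " << bytes << " bytes each";
    return out.str();
}
```

Code that also reads back from the same stream cannot use this swap directly, which is why the fix only covers many use cases rather than all of them.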

As is so often the case, once I knew what was leaking I found references to it all over the Internet:

MSDN Forums

Len Holgate’s Blog

Microsoft Product Feedback

According to the product feedback site, this bug has been fixed for the next version (which hopefully includes SP1 due later this year).

Jul 06

Visual C++ 2005 Resource Editor Not Working

One of my guys at work had a problem with his new VS 2005 install: everything worked fine for compiling but he couldn’t actually edit his resources. For some reason he would get error RC1107, and the resource editor would refuse to list or show resources of any type.

Bizarrely, the fix for the bug is to ensure that the last item in your VC++ include directories has an extra “\” character on it, as described here:

MSDN forums

That is one weird bug!

Jul 06

Visual Studio: Stop Building That Project!

I regularly work on a Visual Studio solution at work with 40+ individual projects in it. Irritatingly, many of them have begun to suffer from the “always rebuilt” problems that recent versions of Visual Studio (2002-2005) have shown. I decided to try and fix these things today, so here is a list of things to try that have helped me:

My project always builds, starting with MIDL

A few potential causes here:

  • If your IDL file imports a file that does not exist (even inside a section that is preprocessed out), VS will continually rebuild the IDL file, and with it all dependent files and projects. Make sure every imported file exists.
  • If your IDL file imports files from another directory which import other files this can sometimes cause the dependency checker to get lost. Try adding your project directory to the include path.
  • For some reason, generating a type library (.tlb) file can be the cause of this. If you don’t need the type library, disable generation in the project properties (or .idl file properties). This is the one that solved the problem for me.

My project always links

I have had a number of projects that link every time even though nothing has changed. A combination of the following tips seems to solve the problem:

  • You can specify wildcards (e.g. *.lib) in the additional dependencies box for the linker. This seems to result in this being re-evaluated at every build. Point at the specific libs you need and you’ll solve this one.
  • Using forward slashes in the “Linker\Advanced\Import Library” option causes projects to need to re-link regularly. I have changed all of mine to standard backslashes and this seems to solve that problem! This tip may cover other directory paths as well – best thing is to ensure that all project settings paths use backslashes.
  • PDB file with a different name to the build output: if your .exe builds as myapp.exe and the PDB is myoldappname.pdb then the project will relink every time. Update the PDB filename to match the .exe!

Knowledge Base/Google Groups Articles

The following knowledge base articles, blog and Google Groups posts were vaguely useful in sorting these problems out:

Jul 06

Building DirectDraw projects with VS2005

DirectDraw has been deprecated by Microsoft and the headers are no longer included by default with Visual Studio 2005. If you get an error about being unable to find ddraw.h then you need to install the latest DirectX SDK (Dec 2005 or greater is compatible with VS 2005).

If you still have missing headers (perhaps “amvideo.h” from DirectShow) you then need to install an updated Platform SDK too.

Life was easier with VS 2003…

I hope this saves someone else some time hunting!