c++

Private private members

Visualization of the concept in “Breaking up is hard to do”.

GNU sizeof, how odd

Apparently, GCC doesn’t like the second version of this simple piece of code:

size_t sizeInt = sizeof(int);    // Compiles on GCC, ICC and VS.
size_t sizeInt = sizeof((int));  // Compiles on ICC and VS.

GCC complains:

expected primary-expression before ‘int’

Sigh.

Breaking up is hard to do (to member functions)

Want to bounce this off some other coders before I make a language-change proposal of it:

C#’s “partial” class type lets you spread a class definition across multiple files. But mostly people use it as a way to make private code definitions private – split large functions up into little nuggets that don’t get exposed in the primary API while still having membership status to access member variables, etc.

The downside is that it also lets anyone add crap to the class from their own definitions, forming part of the need for the “sealed” modifier in C#.

In C++ your choice is (a) huge member functions, (b) Pimpls, (c) classes that list zillions of private members in their public API.

Seems to me, a simple combination of existing keywords would let you create non-public, compilation-unit scoped temporary member functions so you can break a large member function up without all the hassles that come from breaking a member function into private local non-member functions.

Specifically the term “private”.

By marking them this way, the following constraints/restrictions/behaviors would naturally fall out of the current language definition:

  • Cannot be virtual (if it goes in the vtable, it goes in the ‘class’ def),
  • Cannot be abstract (since you’d have to declare it in the class),
  • Cannot share a name with a well-defined member function or variable,
  • Not visible to derived classes even if they are in the same compilation unit, (as applies to any ‘private’ member)
  • Cannot be an operator, (that could lead to some nightmare situations)
  • Cannot be static (static members aren’t limited to ‘public’ access, so there are actually good use cases for ‘private static’)

/// Header file Foo.h
#include <string>

class Foo {
public:
  Foo() : m_bar(""), m_i(0) {}
  void bigFunction();
  void otherFunction();
private:
  void privateFunction();
  std::string m_bar;
  int m_i;
};

 

/// Foo1.cpp
#include "Foo.h"

private void Foo::helper1() {
  m_bar = "hello";
  m_i = 1;
}

private inline void Foo::helper42(); // prototype variant.

void Foo::bigFunction() {
  helper1();      // member call
  …
  helper42();     // member call
}

// Not const: it assigns to member variables.
void Foo::helper42() {
  m_bar = "world";
  m_i = 42;
}

class Bar : public Foo {
  …
  void barMember() {
    helper42(); // Error: it was private to Foo.
  }
};

 

/// Foo2.cpp
#include "Foo.h"

private void Foo::helper1() {
  // Legal, because helper1() was private-unit-scoped in Foo1.cpp and is thus not visible here.
}

private void Foo::privateFunction(const char* const message) {
  // ERROR: The signature is different, but the name conflicts with an established
  // ERROR: member function name, so this isn't legal.
}

void Foo::otherFunction() {
  helper1();	// Invokes the helper1() declared in Foo2.cpp above.
  …
  helper42();	// Error: helper42 is only unit-visible in CPP file #1.
}
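
For contrast, here is a rough sketch of what the “private local non-member functions” route looks like in today's C++. It is not part of the proposal: the accessors bar() and i() are hypothetical additions so the result can be inspected, and the helper signatures are just one way to do it. The hassle on display is that the helpers are not members, so every bit of private state they touch has to be handed to them explicitly.

```cpp
#include <cassert>
#include <string>

// Foo.h as it has to look today: the helpers never appear in the class.
class Foo {
public:
  Foo() : m_bar(""), m_i(0) {}
  void bigFunction();
  const std::string& bar() const { return m_bar; }  // demo-only accessor
  int i() const { return m_i; }                     // demo-only accessor
private:
  std::string m_bar;
  int m_i;
};

// Foo1.cpp: file-local helpers in an unnamed namespace. They have no access
// to Foo's private members, so the state is passed in by reference.
namespace {
  void helper1(std::string& bar, int& i) {
    bar = "hello";
    i = 1;
  }

  void helper42(std::string& bar, int& i) {
    bar = "world";
    i = 42;
  }
}

void Foo::bigFunction() {
  helper1(m_bar, m_i);
  assert(m_i == 1);      // helper1 has run
  helper42(m_bar, m_i);
}
```

Every new member variable a helper needs means another parameter, which is exactly the bookkeeping a unit-scoped “private” member function would avoid.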

Spot the flaws


void someFunction(char* inputStr)
{
char buffer[8];

int n = snprintf(buffer, sizeof(buffer) - 1, inputStr);
buffer[n] = 0;

/* ... */

And while you’re at it – see if you can spot the motivations behind what’s being done.

 

Do Not Do


a[n++] = b[++n];

Hint: If you know what it does, you’re wrong.

 

Less-than optimization

Constraining values to a small range 0 <= N <= limit is something that many compilers now optimize for you, reducing the number of comparisons and eliminating the branch implied by the “&&”:

bool checkConstraint(/*signed*/ int i)
{
    return ( i >= 0 && i <= 10 ) ;
}

bool muchFasterCheckConstraint(/*signed*/ int i)
{
    return ( (unsigned int)i <= (unsigned int)10 ) ;
}
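
The trick works because converting a negative int to unsigned wraps it around to a very large value, so a single unsigned comparison rejects both “too small” and “too big”. A minimal sketch to convince yourself the two forms agree; singleCompareCheck and versionsAgree are names made up for this example:

```cpp
#include <cassert>
#include <climits>

// The two-comparison form from above.
bool checkConstraint(int i) {
  return i >= 0 && i <= 10;
}

// The one-comparison form: negative values wrap to huge unsigned values.
bool singleCompareCheck(int i) {
  return static_cast<unsigned int>(i) <= 10u;
}

// Returns true if both versions give the same answer for every value in [lo, hi].
bool versionsAgree(int lo, int hi) {
  assert(lo <= hi);
  for (long long v = lo; v <= hi; ++v) {  // long long so v can step past hi == INT_MAX
    const int i = static_cast<int>(v);
    if (checkConstraint(i) != singleCompareCheck(i)) {
      return false;
    }
  }
  return true;
}
```

Whether the hand-written version is actually faster depends on your compiler; many will already emit the single comparison for checkConstraint.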

Memory mapping files

Every now and again I dig up my old MUD language (AMUL /SMUGL) and tinker with the source code. Some time last year I used it to explore various optimization/profiling tools and found a large portion of the compilation process was taken up with simple disk IO, and almost all of it on reads: I’d found myself an excuse to experiment with mmap().

I quickly found that while Windows doesn’t support mmap(), it provides its own, in some ways superior, MapViewOfFile. Ultimately, both systems return you a pointer to address space where the file’s contents will magically appear in memory for you without needing to call read() etc.

I was pleasantly surprised by how easy it was to use both systems, and they are similar enough that I was able to do so while building a simple “MappedFile” C++ class wrapper for the process. For source, see http://www.kfs.org/oliver/code/io_mapped_file/ – there’s also a Linux-based mmap() vs read() comparison, and a poor-man’s grep/find example app.
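
To give a flavour of the POSIX side, here is a minimal sketch. It is not the MappedFile class from the link above; readWholeFileViaMmap is a hypothetical helper written for this example, it only covers the read-only case, and it copies the bytes out purely so the result is easy to inspect.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Maps `path` read-only and copies its contents into `out`.
// Returns false if the file cannot be opened, stat'ed or mapped.
bool readWholeFileViaMmap(const char* path, std::string& out) {
  assert(path != nullptr);

  const int fd = open(path, O_RDONLY);
  if (fd < 0) {
    return false;
  }

  struct stat st;
  if (fstat(fd, &st) != 0) {
    close(fd);
    return false;
  }

  if (st.st_size == 0) {  // mmap() rejects zero-length mappings
    close(fd);
    out.clear();
    return true;
  }

  const std::size_t len = static_cast<std::size_t>(st.st_size);
  void* addr = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);  // the mapping stays valid after the descriptor is closed
  if (addr == MAP_FAILED) {
    return false;
  }

  // From here on the file is just memory: no read() calls involved.
  out.assign(static_cast<const char*>(addr), len);
  munmap(addr, len);
  return true;
}
```

In real use you would keep the mapping alive and work on the pointer directly rather than copying it into a string, since avoiding the copy is most of the point.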

Why are enums so tedious?

It’s long, long past time for C/C++ to have some automated way to reflect enums and/or build enums from strings.


enum class ProductType : unsigned char {  Cheese, Whine, Max } ;

const char[ProductType::Max] ProductTypeName = {  "Whine", "Cheese" } ;

struct ProductTypeInfo { ProductType type, const char* name, size_t minOrder, size_t maxOrder, Branch factory } ;
ProductTypeInfo productTypeInfo[ProductType::Max] =
{
  { Whine, "Whine", 1, 10, Branch::Bordeaux }
, { Cheese, "Cheese", 1, 50, Branch::Cheshire }
};
switch ( productType )
{
case ProductType::Cheese: testForCheddar() ; break ;
case ProductType::Whine: hiccup() ; break ;
/* Compiler warning: No test for 'Max' */
}

ARRRRGGHHH! SO MUCH DUPLICATION OF DATA! (And I hope you spotted the [most egregious] mistake).
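
Until the language grows something better, one common workaround (not something from the snippets above, just a sketch) is the “X-macro” trick: list the enumerators once in a macro and generate both the enum and the name table from it. PRODUCT_TYPES, toString and fromString are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Single source of truth: each product appears exactly once.
#define PRODUCT_TYPES(X) \
  X(Cheese)              \
  X(Whine)

// Expands to: Cheese, Whine, Max
enum class ProductType : unsigned char {
#define X(name) name,
  PRODUCT_TYPES(X)
#undef X
  Max
};

// Expands to: "Cheese", "Whine",
const char* const productTypeNames[] = {
#define X(name) #name,
  PRODUCT_TYPES(X)
#undef X
};

static_assert(sizeof(productTypeNames) / sizeof(productTypeNames[0]) ==
                  static_cast<std::size_t>(ProductType::Max),
              "name table out of sync with enum");

// Enum to string.
const char* toString(ProductType t) {
  assert(t < ProductType::Max);
  return productTypeNames[static_cast<std::size_t>(t)];
}

// String to enum; returns ProductType::Max when the name is unknown.
ProductType fromString(const char* s) {
  const std::size_t count = static_cast<std::size_t>(ProductType::Max);
  for (std::size_t i = 0; i < count; ++i) {
    if (std::strcmp(productTypeNames[i], s) == 0) {
      return static_cast<ProductType>(i);
    }
  }
  return ProductType::Max;
}
```

The names can no longer drift out of order relative to the enumerators, at the cost of macro soup that debuggers and IDEs tend to dislike.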

Stroustrup and the C++ community are right, which proves them wrong.

When Bjarne Stroustrup designed the C++ class concept, he made the default accessibility “private”, so as to encourage encapsulation and data hiding.

That’s as far as he went, immediately violating his own principle.

The more redeeming features of C++11

C++11 is the now-ratified, gradually-being-implemented C++ standard that finally solidified last year after far, far too many years in design. The process by which C++11 was finally settled on leaves me thinking that the people on the committee are far, far too divorced from day-to-day C++ usage and the needs of the average software developer. I’m not talking about what they finally gave us, but how they reached the decision on what would be included, how it would be phrased, and so forth.

Some of the features, especially lambdas, are going to be a real pain in the ass when they begin being commonly deployed, introducing write-once-scratch-head-forever crap into the language, and half-assed fixing solutions with copious obfuscation for lesser issues of the same source problem (virtual ‘override’).

But somehow, someway, somefolks got some good stuff into C++11 that I think I will personally benefit from.

auto

While I may not be happy about the word itself, the “auto” feature is very nice.


// Before
typedef std::map< std::string, std::set<std::string> > StringSetMap ;
StringSetMap ssm ;
for ( StringSetMap::iterator it = ssm.begin() ; it != ssm.end() ; ++it )

// After
std::map< std::string, std::set<std::string> > ssm ;
for ( auto it = ssm.begin() ; it != ssm.end() ; ++it )

// Before
Host::Player::State* playerState = (Host::Player::State*)calloc(1, sizeof(Host::Player::State)) ;

// After
auto playerState = (Host::Player::State*)calloc(1, sizeof(Host::Player::State)) ;

range-based for

One of the big concerns of the C++ standards committee was breaking pre-existing code. As a result, awful choices like naming auto “auto” were made. Range-based for is another case of “I’d rather teach people Visual Basic than explain C++11 range-based for to them”, but in practice it’s really nice:


// Before

std::map<unsigned int, std::string> myMap ;
// ...
for ( std::map<unsigned int, std::string>::iterator it = myMap.begin() ; it != myMap.end() ; ++it )
{
  const unsigned int i = it->first ;
  const std::string& str = it->second ;
  // ...
}

// After

std::map<unsigned int, std::string> myMap ;
// ...
for ( const auto& it : myMap )  // 'it' is the key/value pair itself, not an iterator
{
  const unsigned int i = it.first ;
  const std::string& str = it.second ;
  // ...
}

One thing I’m not clear on with range-based for is whether it copes with iterator validity, e.g. what happens (yeah, I know, I could try it, duh) if you do

for ( auto it : myMap )
{
if ( it.first == 0 )
myMap.erase(it) ;
}
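
For what it’s worth, the answer is that it doesn’t cope. A range-based for is shorthand for an ordinary begin()/end() iterator loop with the iterator hidden from you. The loop variable is the pair, not an iterator, so myMap.erase(it) as written won’t compile; myMap.erase(it.first) would, but it invalidates the hidden iterator that the loop then goes on to increment, which is undefined behavior. The safe route is to go back to an explicit iterator loop. A minimal sketch, with eraseZeroKey being a made-up name:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Erases the entry with key 0, if any, while iterating.
// Returns how many entries were removed.
std::size_t eraseZeroKey(std::map<unsigned int, std::string>& myMap) {
  std::size_t erased = 0;
  for (auto it = myMap.begin(); it != myMap.end(); /* advanced in the body */) {
    if (it->first == 0) {
      it = myMap.erase(it);  // C++11: returns the iterator after the erased element
      ++erased;
    } else {
      ++it;
    }
  }
  assert(myMap.count(0) == 0);
  return erased;
}
```

For this particular test a plain myMap.erase(0) outside any loop does the same job; the loop form is what you need once the condition is more than a key lookup.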

extended enum definitions

In the beginning, there were names. And the Stroustrup said, Let there be name spaces, that separate the code from the library. And he named the standard library “std::” and the rest “::”. And it was about bloody time.

Unfortunately, “enum”s slipped thru the cracks. In C and C++ I find enums to be something of a red-headed stepchild, lacking a few really, really important features that programmers always end up resorting to sloppy bad practices to work around.

In C++03 (before C++0x/C++11) there was no way to pre-declare them. If you wanted to declare a function prototype that accepted a LocalizationStringID enumeration, that meant including the whole bloody list of LocalizationStringIDs too.

Compounding this issue is the fact that enums are exposed in the scope they are declared in, so they generally pollute namespaces, so being forced to continually include is a real pain in the butt.

It also means you have to remember the special prefix that every enum list uses, because in order to hide stuff, people tend to make their enum names long.

But the compiler couldn’t otherwise tell what sort of variable was going to be needed for the enum, and there was no way to specify it. Especially when you’re dealing with networking, this is an abject pain in the ass: it’s a variable whose type you don’t control and, in a language without reflection, there is no way to find out what it has been given, in a situation where you care a great deal about exactly how the data is stored.

I’ll get to my other enum issues after I touch on what C++11 did do for enums.


// Before

// Localization string identifiers.
// Try to keep these in-sync with the localization database, please.
enum LSTRING_ID {
LSTR_NONE  // No message,
, LSTR_NOTE  // NOTE: prefix
...
, LSTR_SPILLED_BEER_ON_KEYBOARD // = #6100 as of 11/21/09
...
, LSTR_PREFIX_LINE = -1  // More text to follow.
// ^- this may cause the compiler to use signed storage for the enum,
// or the compiler might always use signed storage for enums,
// or the compiler might never use signed storage.
// Either way, it's going to lead to some interesting type-pun errors.
};

extern void sendLString(playerid_t /*toPlayer*/, int /*lstringID*/) ;
extern void sendLStringID(playerid_t /*toPlayer*/, LSTRING_ID /*lstringID*/);
// ...
LSTRING_ID x = 90210 ; // Valid but bad.
sendLString(pid, LSTR_NONE) ; // Valid but bad practice.
sendLString(pid, 99999) ; // Valid but not good.
sendLStringID(pid, -100) ; // Valid but not good.
sendLStringID(pid, Client::Graphics::RenderType::OpenGL) ; // Valid but OMFG LTC&P MORON!
// Last but not least ...
sendLStringID(pid, LSTR_SPILLED_BEER_ON_KEYBOARD) ;

C++11 cleans up on enums big time. Firstly with enum class, and here, I think hats off to the committee for coming up with a rather nice syntax although they had to fudge it to avoid what I find a dumb caveat of the C++ class definition :)

An enum class is a class-like namespace, complete with type safety, that contains an enumeration. Borrowing further from the class definition, it reuses the base-class syntax to let you specify the underlying type used for the enumeration. If omitted, an enum class defaults to int; it’s only the good old plain enum that leaves you with the I-dunno-wtf-type-that-is enum type.


enum class LStringIds : signed short
{
PrefixLine = -1
, None = 0
, Note
...
, SpilledBeerOnKeyboard

};

sendLStringID(playerID, LStringIds::SpilledBeerOnKeyboard) ;
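
A small sketch of what that buys you. The enumerator list here is a cut-down stand-in (the real list is elided above, so the values won’t match the real IDs), and toWire/fromWire are hypothetical helpers for the networking case:

```cpp
#include <cassert>
#include <type_traits>

enum class LStringIds : signed short {
  PrefixLine = -1,
  None = 0,
  Note,
  SpilledBeerOnKeyboard  // stand-in value; the real list is much longer
};

// The storage type is now something you chose, and you can check it at compile time.
static_assert(std::is_same<std::underlying_type<LStringIds>::type, signed short>::value,
              "LStringIds is stored as a short");
static_assert(sizeof(LStringIds) == sizeof(short), "same size as a short");

// Going to the wire: the conversion has to be spelled out.
short toWire(LStringIds id) {
  return static_cast<short>(id);
}

// Coming off the wire: same again, with a sanity check on the stand-in range.
LStringIds fromWire(short raw) {
  assert(raw >= -1 && raw <= 2);
  return static_cast<LStringIds>(raw);
}

// None of these compile:
//   LStringIds x = 90210;          // no implicit int -> enum
//   short s = LStringIds::Note;    // no implicit enum -> int
//   toWire(-100);                  // a bare number is not an LStringIds
```

The casts don’t go away, but they are now explicit and greppable, and the size of what goes into the packet is no longer the compiler’s little secret.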

By using this class mechanism, you can also pre-declare enums now:


enum class LSTRING_ID : short ; // VS11 Beta doesn't support this as of 2012-03-03, though it says it does.

GCC 4.6 supports this, and it reduced game-server compilation time by about 8%. RAR!
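
Roughly how that plays out across a header and a source file, squeezed into one block for illustration. The playerid_t typedef, the short enumerator list and the return value are stand-ins invented for this sketch (the real sendLStringID returns void):

```cpp
#include <cassert>

// --- what the widely-included header needs ---

// Opaque declaration: the underlying type is fixed, the enumerators are not listed.
enum class LSTRING_ID : short;

typedef int playerid_t;  // stand-in; the real typedef isn't shown

// The prototype compiles without the enumerator list.
short sendLStringID(playerid_t toPlayer, LSTRING_ID lstringID);

// --- what only the files that name individual IDs need ---

enum class LSTRING_ID : short {
  PrefixLine = -1,
  None = 0,
  Note
};

// Demo body: hands back the raw value that would go on the wire.
short sendLStringID(playerid_t toPlayer, LSTRING_ID lstringID) {
  assert(toPlayer >= 0);
  return static_cast<short>(lstringID);
}
```

Files that merely pass an LSTRING_ID along only need the one-line declaration, so editing the big list no longer forces them to recompile.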

We can also use nice names for the LString IDs now without worrying about naming clashes.

I would like to point out that anyone who knows C++ should have spotted that there is no access type specification; you don’t have to write:


enum class Foo : public unsigned char { public: Fred ... } ;

It makes no sense that the names in an enum class be private. But then it also makes no sense that the stuff in a class be private by default.

Note: There is also “enum struct” … which as far as I understand is exactly the same, it’s just less likely to make you forget to put “public” at the top of your next real class declaration :)

I’m going to write a separate post about my other gripes with enums :)

time

Not many people know this, but computers are utterly shit at keeping track of time. The hardware involved is a joke, and because of various historical bugs in each operating system’s timekeeping routines, it’s really bloody messy and expensive to get time in meaningful terms, never mind portably.

For example, under Windows you have to prat about with QueryPerformanceCounter, and then you have to make sure to check for time going backwards or leaping forwards, and stuff.

C++11 doesn’t fix that, but it helps by providing functions to deal with a lot of this stuff in the standard library via the <chrono> header. Yay! The template origamists really came out and did their thing, there are some really nice features involved there, including the ability to create timing variables that include their quanta in their compile-time type information so that the compiler can do smart stuff like working out that you’re comparing seconds with minutes and what it needs to do to handle that situation…
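
A minimal sketch of the seconds-versus-minutes case and of timing something with the monotonic clock; longerThan and timeIt are names made up for this example:

```cpp
#include <cassert>
#include <chrono>

// Mixed units compare correctly: the compiler works out the conversion
// from the ratio baked into each duration type.
bool longerThan(std::chrono::minutes limit, std::chrono::seconds elapsed) {
  return elapsed > limit;
}

// Times a callable using steady_clock, which is guaranteed never to go backwards.
template <typename Fn>
std::chrono::microseconds timeIt(Fn fn) {
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto stop = std::chrono::steady_clock::now();
  assert(stop >= start);
  // Converting to a coarser or possibly lossy unit has to be asked for explicitly.
  return std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
}
```

Typical use would be something like timeIt([]{ /* work */ }).count() to get a microsecond figure. How fine-grained steady_clock really is still depends on the platform underneath.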

parallelism

Oh, god, yes.