c++

Ready to plus it up – a little

The foray into raw C has been fun, but there are three things that I miss: references from C++, multiple return values from Go et al., and some variation of function-to-structure association.

References save a lot of null checking, and ‘.’ is two fewer keypresses than ‘->’.

Multiple return values let you tackle the oddity of “pass me the address of the thing I’m going to give you”:

// what I mean:
read(char* into, size_t limit) returns (bool success, size_t bytesRead)

// what c libraries do:
ssize_t /* -1=err */ read(char *into, size_t limit)

// what I do:
error_t read(char *into, size_t limit, size_t *bytesReadp)

But that pointer on the right reads as an input parameter to my function, not as the output I want it to denote.
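
For what it's worth, here is a minimal sketch of how the “what I mean” shape could look in language-only C++: hand both outputs back by value in a small struct. ReadResult, readInto and the fixed "hello" source are names I made up for illustration, not anything taken from the AMUL code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical result type: both outputs travel back by value,
// so no out-pointer shows up in the parameter list.
struct ReadResult {
    bool success;
    std::size_t bytesRead;
};

// Hypothetical stand-in for a real read: copies up to `limit` bytes
// of a fixed message into `into`.
ReadResult readInto(char *into, std::size_t limit) {
    static const char source[] = "hello";
    if (into == nullptr) {
        return {false, 0};
    }
    const std::size_t available = sizeof(source) - 1; // exclude the NUL
    const std::size_t count = available < limit ? available : limit;
    std::memcpy(into, source, count);
    return {true, count};
}

// Usage sketch: the caller gets success and length together.
void demoReadInto() {
    char buf[3];
    const ReadResult r = readInto(buf, sizeof(buf));
    assert(r.success && r.bytesRead == 3);
}
```

With a C++17 compiler, structured bindings get you fairly close to the Go spelling: auto [ok, n] = readInto(buf, sizeof(buf));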

Methods give you name spacing, first and foremost, but they also just provide a logical separation I find I’ve missed in going back to C.

In particular, Go’s method implementation really appeals to me.

Go doesn’t have the complication of header files, so unlike the C+… family of languages, you aren’t bogged down with boilerplate. It also didn’t feel the need to convolute structure definitions with pre-declarations of function implementations.

When viewed through a C++, Java, C# etc. lens, the following will make your skin crawl:

type Buffer struct {
    data     []byte
    cursor   int
}

func (b *Buffer) Read(into []byte) (bytesRead int, err error) {
    ....
}

On the left, (b *Buffer) is the “receiver” type; it’s how Read is associated with Buffer. To the right of the method name is the parameter list (into), and after that is the list of return values.

How are you supposed to know what methods a class has? That’s probably part of what made your skin itch.

The answer is: probably not by going and reading header files or code. Go has incredibly powerful tooling that makes it a doddle to integrate with IDEs and editors. It’s sickeningly fast to parse, and documentation is generally highly accessible.

I’ve about gotten the AMUL code to a place where I’m ready to think about trying language-only C++ – that is, C++ language features but none of the STL.

I did a thing

About a month ago, I dug up an old copy of my MUD language from 1990, and took a crack at refactoring some of it in C++, but I quickly got frustrated with just how much that felt like work rather than fun, especially after my recent foray into golang.

So I have this 29+ year old code, written in original K&R ANSI C, and I want to get it, say, at least compiling.

Why not take a shot at doing it in pure C?

CPPCon 2017

I love and hate conventions, so I don’t go to them all that often.

Although I’ve watched CPPCon videos, I hadn’t considered it something you actually attended until this year; I wasn’t really convinced it would be worth going.

The agenda for the first few days proposed some very interesting stuff, and I decided to dip my toe in.

Private private members

Visualization of the concept in “Breaking up is hard to do”.

GNU sizeof, how odd

Apparently, GCC doesn’t like the following second version of this simple piece of code:

size_t sizeInt = sizeof(int);    // Compiles on GCC, ICC and VS.
size_t sizeInt = sizeof((int));  // Compiles on ICC and VS.

GCC complains:

expected primary-expression before ‘int’

Sigh.

Breaking up is hard to do (to member functions)

Want to bounce this off some other coders before I make a language-change proposal of it:

C#’s “partial” class type lets you spread a class definition across multiple files. But mostly people use it as a way to make private code definitions private – split large functions up into little nuggets that don’t get exposed in the primary API while still having membership status to access member variables, etc.

The downside is: it actually lets you add crap to the class in your own definitions, forming part of the need for the “sealed” modifier in C#.

In C++ your choice is (a) huge member functions, (b) Pimpls, (c) classes that list zillions of private members in their public API.

Seems to me, a simple combination of existing keywords would let you create non-public, compilation-unit scoped temporary member functions so you can break a large member function up without all the hassles that come from breaking a member function into private local non-member functions.

Specifically the term “private”.

By marking them this way, the following constraints/restrictions/behaviors would naturally fall out of the current language definition:

  • Cannot be virtual (if it goes in the vtable, it goes in the ‘class’ def),
  • Cannot be abstract (since you’d have to declare it in the class),
  • Cannot share a name with a well-defined member function or variable,
  • Not visible to derived classes even if they are in the same compilation unit, (as applies to any ‘private’ member)
  • Cannot be an operator, (that could lead to some nightmare situations)
  • Cannot be static (static members aren’t limited to ‘public’ access, so there are actually good use cases for ‘private static’)

/// Header file Foo.h
class Foo {
public:
  Foo() : m_bar(""), m_i(0) {}
  void bigFunction();
  void otherFunction();
private:
  void privateFunction();
  const char* m_bar;  // member declarations (types assumed)
  int m_i;
};

 

/// Foo1.cpp
#include "Foo.h"

private void Foo::helper1() {
  m_bar = "hello";
  m_i = 1;
}

private inline void Foo::helper42(); // prototype variant.

void Foo::bigFunction() {
  helper1();      // member call
  …
  helper42();     // member call
}

// Not const: it assigns to members.
void Foo::helper42() {
  m_bar = "world";
  m_i = 42;
}

class Bar : public Foo {
  …
  void barMember() {
    helper42(); // Error: it was private to Foo.
  }
};

 

/// Foo2.cpp
#include "Foo.h"

private void Foo::helper1() {
  // Legal, because helper1() was private-unit-scoped in Foo1.cpp and is thus not visible here.
}

private void Foo::privateFunction(const char* const message) {
  // ERROR: The signature is different but the name conflicts with an established
  // ERROR: member function name, so this isn't legal.
}

void Foo::otherFunction() {
  helper1();	// Invokes the helper1() declared in Foo2.cpp above.
  …
  helper42();	// Error: helper42 is only unit-visible in CPP file #1.
}
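
Until something like that exists, the nearest approximation I know of in today's C++ is a private nested struct that is only declared in the header and defined in the .cpp file. A nested class counts as a member, so its static functions can reach the private data. The sketch below squashes header and source into one listing; Helpers, value() and the trimmed-down Foo are illustrative names of mine, and this is a workaround rather than the proposed feature.

```cpp
#include <cassert>

// --- Foo.h (hypothetical, trimmed down) ---
class Foo {
public:
    Foo() : m_i(0) {}
    void bigFunction();
    int value() const { return m_i; }

private:
    struct Helpers; // only declared here; the definition lives in the .cpp
    int m_i;
};

// --- Foo.cpp ---
// A nested class is a member of Foo, so it may touch Foo's privates.
struct Foo::Helpers {
    static void helper1(Foo &self) { self.m_i = 1; }
    static void helper42(Foo &self) { self.m_i = 42; }
};

void Foo::bigFunction() {
    Helpers::helper1(*this);  // explicit "self" instead of a member call
    Helpers::helper42(*this);
}

// Usage sketch: callers only ever see the public API.
void demoFoo() {
    Foo f;
    f.bigFunction();
    assert(f.value() == 42);
}
```

It falls short of the proposal in two ways: the helpers need an explicit self argument, and a second .cpp file can't define its own different Foo::Helpers without breaking the one-definition rule, so each file needs its own separately declared name in the header.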

Spot the flaws


void someFunction(char* inputStr)
{
char buffer[8];

int n = snprintf(buffer, sizeof(buffer) - 1, inputStr);
buffer[n] = 0;

/* ... */

And while you’re at it – see if you can spot the motivations behind what’s being done.
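
If you want something to check your answer against, here is one possible cleaned-up variant. It is a sketch under my own assumptions: copyInput and its out/outSize parameters are hypothetical, and it only addresses the flaws I read into the snippet (caller data used as the format string, the needless "- 1", and indexing the buffer with a return value that can be negative or larger than the buffer).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical corrected variant.
// Returns true only if inputStr fit into `out` without truncation.
bool copyInput(const char *inputStr, char *out, std::size_t outSize) {
    if (inputStr == nullptr || out == nullptr || outSize == 0) {
        return false;
    }
    // "%s" keeps caller-supplied text out of the format-string position.
    const int n = std::snprintf(out, outSize, "%s", inputStr);
    // snprintf always NUL-terminates (for outSize > 0) and returns the
    // length it wanted to write, so n is never used as an index.
    return n >= 0 && static_cast<std::size_t>(n) < outSize;
}

// Usage sketch with the same 8-byte buffer as the puzzle.
void demoCopyInput() {
    char buffer[8];
    assert(copyInput("abc", buffer, sizeof(buffer)));
    assert(!copyInput("0123456789", buffer, sizeof(buffer))); // truncated
    assert(std::strcmp(buffer, "0123456") == 0);              // still terminated
}
```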

 

Do Not Do


a[n++] = b[++n];

Hint: If you know what it does, you’re wrong.

 

Less-than optimization

Checking that a value lies in a small range 0 <= N <= limit is something that many compilers now optimize for you, reducing the number of comparisons and eliminating the branch implied by the “&&”:

bool checkConstraint(/*signed*/ int i)
{
    return ( i >= 0 && i <= 10 ) ;
}

bool muchFasterCheckConstraint(/*signed*/ int i)
{
    return ( (unsigned int)i <= (unsigned int)10 ) ;
}
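
The trick works because converting a negative int to unsigned wraps it to a huge value, which then fails the single upper-bound test. A quick sketch to convince yourself the two forms agree; checkRange, checkRangeUnsigned and agreeOnRange are hypothetical names, and the limit is assumed to be non-negative.

```cpp
#include <cassert>

// Two-comparison form.
bool checkRange(int i, int limit) {
    return i >= 0 && i <= limit;
}

// Single-comparison form: negative i wraps to a large unsigned value.
bool checkRangeUnsigned(int i, int limit) {
    assert(limit >= 0); // the trick is only valid for a non-negative limit
    return static_cast<unsigned int>(i) <= static_cast<unsigned int>(limit);
}

// Returns true if both forms give the same answer for every i in [lo, hi].
bool agreeOnRange(int lo, int hi, int limit) {
    for (int i = lo; i <= hi; ++i) {
        if (checkRange(i, limit) != checkRangeUnsigned(i, limit)) {
            return false;
        }
    }
    return true;
}
```

Whether the hand-written cast is actually any faster depends on the compiler and flags, so it is worth comparing the generated assembly before keeping it.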

Memory mapping files

Every now and again I dig up my old MUD language (AMUL/SMUGL) and tinker with the source code. Some time last year I used it to explore various optimization/profiling tools and found a large portion of the compilation process was taken up with simple disk IO, and almost all of it on reads: I’d found myself an excuse to experiment with mmap().

I quickly found that while Windows doesn’t support mmap(), it provides its own, in some ways superior, MapViewOfFile. Ultimately, both systems return you a pointer to address space where the file’s contents will magically appear in memory for you without needing to call read() etc.

I was pleasantly surprised by how easy it was to use both systems, and they are similar enough that I was able to do so while building a simple “MappedFile” C++ class wrapper for the process. For source, see http://www.kfs.org/oliver/code/io_mapped_file/ – there’s also a Linux-based mmap() vs read() comparison, and a poor-man’s grep/find example app.
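
To give a flavour of the POSIX side, here is a minimal sketch that maps a file read-only and walks it through the returned pointer. It is not the MappedFile class from the link above; countNewlinesMapped is a hypothetical helper, and the Windows path (CreateFileMapping plus MapViewOfFile) is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper, POSIX only: counts '\n' bytes in a file by mapping it.
// Returns -1 on any error.
long countNewlinesMapped(const char *path) {
    assert(path != nullptr);
    const int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    const std::size_t size = static_cast<std::size_t>(st.st_size);
    if (size == 0) {
        close(fd);
        return 0; // mmap rejects zero-length mappings
    }

    void *base = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); // the mapping stays valid after the descriptor is closed
    if (base == MAP_FAILED) {
        return -1;
    }

    // No read() calls: the file contents are simply there behind the pointer.
    const char *bytes = static_cast<const char *>(base);
    long lines = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (bytes[i] == '\n') {
            ++lines;
        }
    }
    munmap(base, size);
    return lines;
}
```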