<presentation caption="Avoiding pitfalls in C extensions" bg="s2b.png">
  <style source="pitfalls.css" />
  <page bg="s2.png">
    <group>
      <title>Avoiding Pitfalls in C Extensions</title>
      <text> </text>
      <text>RubyConf.new(2007)</text>
      <text> </text>
      <text>Paul Brannan</text>
    </group>
  </page>
  
  <!--
  <page>
    <group>
      <heading>Avoiding Pitfalls in C Extensions</heading>
      <heading-separator/>
      <title>An observation game</title>
    </group>
    <group>
      <list>Watch closely</list>
      <list>Don't give away the secret!</list>
    </group>
    <group>
      <list>More of these games to come - in Ruby and C</list>
    </group>
  </page>
  -->

  <page>
    <group>
      <heading>Avoiding Pitfalls in C Extensions</heading>
      <title>Observation</title>
      <list>Werewolf</list>
      <list>Pair programming</list>
      <list>Pay attention to detail</list>
    </group>
  </page>

  <page>
    <group>
      <title>Two reasons for writing extensions</title>
    </group>
    <group>
      <list>Speed</list>
    </group>
    <group>
      <list>Interface with other languages</list>
    </group>
    <group>
      <list>Concepts that apply to one usually apply to the other</list>
    </group>
  </page>

  <page>
    <group>
      <title y="100">A simple example - adding two numbers</title>
    </group>
    <group>
      <code>
#include &lt;ruby.h&gt;

/* Add integers v1 and v2 and return the result */
static VALUE add(VALUE self, VALUE v1, VALUE v2)
{
  /* Convert our arguments */
  int i1 = NUM2INT(v1);
  int i2 = NUM2INT(v2);

  /* Do the math */
  int result = i1 + i2;
      </code>
    </group>
  </page>

  <page>
    <group>
      <title y="100">Adding two numbers (cont'd)</title>
      <code>
  /* Convert and return the result */
  return INT2NUM(result);
}

/* Initialize the library */
void Init_add(void)
{
  rb_define_global_function("add", add, 2);
}
      </code>
    </group>
  </page>

  <page>
    <group>
      <title y="100">Building the extension</title>
      <text y="180">extconf.rb:</text>
      <code>
require 'mkmf'
create_makefile('add')
      </code>
      <code>
$ ruby extconf.rb
creating Makefile
$ make
      </code>
    </group>
  </page>

  <page>
    <group>
      <title y="100">Using the extension</title>
      <text y="180">test.rb:</text>
      <code>
require 'add'
puts add(2, 3)
      </code>
      <code>
$ ruby test.rb
5
      </code>
    </group>
  </page>

  <!--
  <page>
    <group><title>A real-world example</title></group>
    <group><list>Let's build something boring</list></group>
    <group><list>Let's build...</list></group>
    <group><list>hello world in C!</list></group>
    <group><list>nah</list></group>
  </page>
  -->

  <page>
    <group><title>A real-world example</title></group>
    <group><list>Let's build something fun</list></group>
    <group><list>hello world in C!</list></group>
    <group><list>nah</list></group>
    <group><list>Let's build...</list></group>
    <group><list>A JIT compiler!</list></group>
  </page>

  <page>
    <group><title>Tools we'll need:</title></group>
    <group><list>Ruby</list></group>
    <group><list>C compiler</list></group>
    <group><list>libjit</list></group>
  </page>

  <page>
    <group><title>Adding numbers with libjit</title>
    <list>Create a function signature</list>
    <list>Create and lock a context</list>
    <list>Specify the instructions in the function body</list>
    <list>Compile the function</list>
    <list>Unlock the context</list>
    </group>
    <group><system method="start" cmd="cd e2; xterm -fn 12x24"/></group>
  </page>

  <page>
    <group><title>So that works...</title></group>
    <group><list>But it's not a JIT compiler yet</list></group>
    <group><list>We still need:</list></group>
    <group><sublist>A parser</sublist></group>
    <group><sublist>A script to compile and run our code</sublist></group>
    <group><sublist color="green">A wrapper for libjit - littlejit</sublist></group>
  </page>

  <page>
    <group><title>Littlejit - pieces we need</title></group>
    <group><list>Functions</list>
      <sublist list="circle">Create</sublist>
      <sublist>Append instructions</sublist>
      <sublist>Compile</sublist>
      <sublist>Call/apply</sublist>
    </group>
    <group><list>Values (variables)</list>
      <sublist>Result of an arithmetic instruction</sublist>
    </group>
    <group><list>Constants</list>
      <sublist>Literals, e.g. 1, 2, 3</sublist>
    </group>
  </page>

  <page>
    <group>
      <title>Littlejit - pieces we need</title>
      <list color="green">Functions</list>
      <sublist color="green">Create</sublist>
      <sublist>Append instructions</sublist>
      <sublist>Compile</sublist>
      <sublist>Call/apply</sublist>
      <list>Values (variables)</list>
      <sublist>Result of an arithmetic instruction</sublist>
      <list>Constants</list>
      <sublist>Literals, e.g. 1, 2, 3</sublist>
    </group>
    <group><system method="start" cmd="cd e3; xterm -fn 12x24"/></group>
  </page>

  <!--
  <page>
    <group><title>Creating functions</title></group>
    <group><list>Create the context, signature, etc.</list></group>
    <group><list>Call jit_function_create</list></group>
    <group><list>Wrap the function with Data_Wrap_Struct</list></group>
    <group><list>Return it to ruby</list></group>
    <group><system method="start" cmd="cd e3; xterm -fn 12x24"/></group>
  </page>
  -->

  <page>
    <group><title>Naming conventions</title></group>
    <group><list>Two variables with the same name?</list>
    <sublist>foo_v vs. foo</sublist>
    <sublist>v vs. pv (bignum.c)</sublist>
    <sublist>method vs. mdata (eval.c)</sublist>
    <sublist>dir vs. dp (dir.c)</sublist>
    <sublist>emitter vs. emitterPtr (syck/rubyext.c)</sublist>
    <sublist>obj vs. data (curses.c)</sublist>
    </group>
    <group><list>Personal preference</list></group>
    <group><list>Be consistent!</list></group>
  </page>

  <page>
    <group><title y="100">Data conversion</title></group>
    <group>
    <code>
  unsigned long num_args = NUM2ULONG(num_args_v);
</code>
</group>
    <group><list>Lots of integer conversions</list></group>
    <group>
      <sublist>NUM2ULONG</sublist>
      <sublist>NUM2UINT</sublist>
      <sublist>FIX2ULONG</sublist>
      <sublist>FIX2INT</sublist>
      <sublist>...</sublist>
    </group>
    <group><list>Consider:</list>
      <sublist>Valid ranges for your data types</sublist>
      <sublist>Valid ranges for your data</sublist>
    </group>
    <group><list>Prefer the NUM2* macros: they check type and range; FIX2* assume a Fixnum</list></group>
  </page>
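
  <page>
    <group><title y="100">Data conversion, seen from Ruby</title></group>
    <group>
      <text>A minimal sketch, assuming String#* converts its count with NUM2LONG, as string.c in
        MRI does. The type's range and the data's range fail in different ways:</text>
      <code>
# Range of the C type: enforced by the conversion macro.
begin
  "ab" * (2**64)
rescue RangeError => e
  puts "RangeError: #{e.message}"
end

# Range of the data (no negative counts): needs an explicit check.
begin
  "ab" * -1
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
      </code>
    </group>
  </page>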

  <page>
    <group>
      <title>Littlejit - pieces we need</title>
      <list color="green">Functions</list>
      <sublist color="lightblue">Create</sublist>
      <sublist>Append instructions</sublist>
      <sublist color="green">Compile</sublist>
      <sublist>Call/apply</sublist>
      <list>Values (variables)</list>
      <sublist>Result of an arithmetic instruction</sublist>
      <list>Constants</list>
      <sublist>Literals, e.g. 1, 2, 3</sublist>
    </group>
    <group><system method="start" cmd="cd e4; xterm -fn 12x24"/></group>
  </page>

  <page>
    <group>
      <title>Exception safety</title>
      <list>Exception-safe code is better code</list>
      <list>Use rb_ensure or rb_protect to clean up resources in C extensions</list>
    </group>
  </page>
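
  <page>
    <group><title y="100">Exception safety: the pattern in Ruby</title></group>
    <group>
      <text>rb_ensure is the C counterpart of begin/ensure. A minimal Ruby model of the pattern;
        FakeHandle and with_handle are hypothetical names, not part of littlejit:</text>
      <code>
# Hypothetical stand-in for a resource allocated by a C library.
class FakeHandle
  attr_reader :released

  def initialize
    @released = false
  end

  def release
    @released = true
  end
end

# Yield a handle and release it even if the block raises.
def with_handle
  handle = FakeHandle.new
  begin
    yield handle
  ensure
    handle.release
  end
end

leaked = nil
begin
  with_handle do |h|
    leaked = h
    raise "instruction could not be appended"
  end
rescue RuntimeError => e
  puts "rescued: #{e.message}"
end
puts "released: #{leaked.released}"
      </code>
    </group>
  </page>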

  <page>
    <group>
      <title>Check return values</title>
      <list>C APIs rarely use exceptions</list>
      <list>Check return values at every point and convert error codes to exceptions</list>
    </group>
    <group><system method="start" cmd="cd e5; xterm -fn 12x24"/></group>
  </page>
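
  <page>
    <group><title y="100">Check return values: a model</title></group>
    <group>
      <text>A minimal Ruby model, assuming a C call that returns 0 on failure. fake_compile,
        compile! and CompileError are hypothetical names used only for illustration:</text>
      <code>
# Raised when the (hypothetical) C call reports failure.
CompileError = Class.new(StandardError)

# Stand-in for a C function: 1 on success, 0 on failure.
def fake_compile(instruction_count)
  instruction_count.zero? ? 0 : 1
end

# Check the status at the call site and raise on failure,
# so callers never see a raw error code.
def compile!(instruction_count)
  status = fake_compile(instruction_count)
  raise CompileError, "compile failed (status #{status})" if status.zero?
  true
end

puts compile!(3)
begin
  compile!(0)
rescue CompileError => e
  puts "CompileError: #{e.message}"
end
      </code>
    </group>
  </page>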

  <page>
    <group><title>Unit testing</title></group>
    <group><list>How do we unit test this code?</list></group>
    <group><list>Most C libraries aren't unit testable</list></group>
    <group><sublist>Inject a layer so we can use mock objects</sublist></group>
    <group><sublist>Don't bother - we have to write an integration test anyway</sublist></group>
    <group><sublist>Rewrite the library so it is unit testable</sublist></group>
  </page>
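
  <page>
    <group><title y="100">Unit testing: injecting a layer</title></group>
    <group>
      <text>A minimal sketch of the first option. FakeBackend, Adder and insn_add are hypothetical
        names; the fake records calls instead of touching the real library:</text>
      <code>
# Hypothetical fake: records calls, returns a canned value.
class FakeBackend
  attr_reader :calls

  def initialize
    @calls = []
  end

  def insn_add(a, b)
    @calls.push([:insn_add, a, b])
    :sum
  end
end

# Code under test talks to whatever backend it is given.
class Adder
  def initialize(backend)
    @backend = backend
  end

  def build(a, b)
    @backend.insn_add(a, b)
  end
end

backend = FakeBackend.new
result = Adder.new(backend).build(:x, :y)
p backend.calls
p result
      </code>
    </group>
  </page>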

  <page>
    <group>
      <title>Littlejit - pieces we need</title>
      <list color="green">Functions</list>
      <sublist color="lightblue">Create</sublist>
      <sublist color="green">Append instructions</sublist>
      <sublist color="lightblue">Compile</sublist>
      <sublist>Call/apply</sublist>
      <list color="green">Values (variables)</list>
      <sublist>Result of an arithmetic instruction</sublist>
      <list>Constants</list>
      <sublist>Literals, e.g. 1, 2, 3</sublist>
    </group>
    <group><system method="start" cmd="cd e6; xterm -fn 12x24"/></group>
  </page>

  <page>
    <group>
      <title>Ownership</title>
      <list>Critical to consider who owns a wrapped object</list>
      <list>If Ruby does not own the object, Ruby should not destroy it</list>
      <list>If a parent object owns the object and the parent object is owned by Ruby, the parent
        object should be marked</list>
    </group>
    <group><system method="start" cmd="cd e7; xterm -fn 12x24"/></group>
  </page>

  <page>
    <group>
      <title>Littlejit - pieces we need</title>
      <list color="green">Functions</list>
      <sublist color="lightblue">Create</sublist>
      <sublist color="lightblue">Append instructions</sublist>
      <sublist color="lightblue">Compile</sublist>
      <sublist color="green">Call/apply</sublist>
      <list color="lightblue">Values (variables)</list>
      <sublist>Result of an arithmetic instruction</sublist>
      <list color="green">Constants</list>
      <sublist>Literals, e.g. 1, 2, 3</sublist>
    </group>
    <group><system method="start" cmd="cd e9; xterm -fn 12x24"/></group>
  </page>

  <page>
    <group>
      <title>So that works...</title>
      <list>But it's not a JIT compiler yet</list>
      <list>We still need:</list>
      <sublist color="green">A parser</sublist>
      <sublist color="green">A script to compile and run our code</sublist>
      <sublist color="lightblue">A wrapper for libjit</sublist>
    </group>
    <group><system method="start" cmd="cd e9; xterm -fn 12x24"/></group>
  </page>

  <page>
    <group><title>Is it safe?</title></group>
    <group><list>Security concerns:</list></group>
    <group><sublist>Arbitrary code execution in high $SAFE levels</sublist></group>
    <group><sublist>Use of potentially harmful tainted data</sublist></group>
    <group><system method="start" cmd="cd e10; xterm -fn 12x24"/></group>
  </page>

  <page>
    <group><title>What have we learned?</title></group>
    <group><list>Consistency in naming</list></group>
    <group><list>Consistency in data types and conversions</list></group>
    <group><list>Check return values</list></group>
    <group><list>Exception safety</list></group>
    <group><list>Unit testing</list></group>
    <group><list>Object ownership</list></group>
    <group><list>Security</list></group>
  </page>

  <page>
    <group><title>Another example</title></group>
    <group><list>Stock symbol lookup application</list></group>
    <group><list>Using boost::shared_ptr</list></group>
    <group><code>typedef boost::shared_ptr&lt;Symbol_Info&gt;</code>
      <code>  Symbol_Info_shptr;</code></group>
    <group><system method="start" cmd="cd s1; xterm -fn 12x24"/></group>
    <group><list>Consider ownership</list></group>
    <group><sublist>boost::shared_ptr owns the object</sublist></group>
    <group><sublist>Ruby owns the shared pointer</sublist></group>
    <!-- TODO: diagram -->
  </page>

  <page>
    <group><title>Exception conversion</title></group>
    <group><list>Simple rules:</list></group>
    <group><sublist>Convert C++ exceptions to Ruby exceptions at C++/Ruby
        boundaries</sublist></group>
    <group><sublist>Convert Ruby exceptions to C++ exceptions at Ruby/C++
        boundaries</sublist></group>
    <group><system method="start" cmd="cd s2; xterm -fn 12x24"/></group>
  </page>

  <page>
    <group><title>A database</title></group>
    <group><list>Use PStore - it's simple</list></group>
    <group><system method="start" cmd="cd s3; xterm -fn 12x24"/></group>
    <group><list>Need marshal_dump/marshal_load</list></group>
    <group><system method="start" cmd="cd s4; xterm -fn 12x24"/></group>
    <group><system method="start" cmd="cd s5; xterm -fn 12x24"/></group>
  </page>
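
  <page>
    <group><title y="100">marshal_dump/marshal_load: a model</title></group>
    <group>
      <text>PStore saves objects with Marshal, which cannot dump a wrapped C struct by default.
        A minimal model: SymbolInfo is hypothetical and a Proc stands in for the native handle:</text>
      <code>
# Hypothetical wrapper; @handle models a native pointer
# (a Proc is used because Marshal cannot dump it either).
class SymbolInfo
  attr_reader :name, :price

  def initialize(name, price)
    @name = name
    @price = price
    @handle = proc { }
  end

  # Dump only the plain data, never the handle.
  def marshal_dump
    [@name, @price]
  end

  # Rebuild the object, including a fresh handle.
  def marshal_load(data)
    initialize(*data)
  end
end

copy = Marshal.load(Marshal.dump(SymbolInfo.new("RUBY", 42)))
puts "#{copy.name} #{copy.price}"
      </code>
    </group>
  </page>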

  <page>
    <group><title>Oft-forgotten methods in C extensions</title></group>
    <group><list>marshal_dump/marshal_load</list></group>
    <group><list>dup/clone</list></group>
    <group><list>to_s/inspect</list></group>
  </page>
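
  <page>
    <group><title y="100">dup/clone and to_s/inspect: a model</title></group>
    <group>
      <text>A minimal Ruby model, assuming the wrapper holds mutable state that copies must not
        share. JitFunction is a hypothetical name; @ops stands in for data in a C struct:</text>
      <code>
# Hypothetical wrapper; @ops models state held in a C struct.
class JitFunction
  attr_reader :ops

  def initialize
    @ops = []
  end

  # Called by dup and clone: give the copy its own state.
  def initialize_copy(other)
    super
    @ops = other.ops.dup
  end

  def to_s
    "JitFunction(#{@ops.join(', ')})"
  end
  alias inspect to_s
end

original = JitFunction.new
original.ops.push(:add)
copy = original.dup
copy.ops.push(:mul)
puts original
puts copy
      </code>
    </group>
  </page>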

  <!--
  <page>
    <group><title>Callbacks</title></group>
  </page>

  <page>
    <group><title>Inheritance</title></group>
  </page>
  -->

  <page>
    <group><title>Extensions - things that can go wrong</title></group>
    <group><list>Conversions</list></group>
    <group><list>Exceptions</list></group>
    <group><list>Incorrect object initialization</list></group>
    <group><list>Inheritance</list>
      <sublist>Pointer to base vs. pointer to derived</sublist>
    </group>
    <!-- TODO: diagram -->
    <group><list>Incorrect use of Ruby API</list></group>
  </page>

  <page>
    <group><title>Tools</title></group>
    <group><list>SWIG</list>
    <sublist>Simplified Wrapper and Interface Generator</sublist></group>
    <group><list>Rice</list>
    <sublist>Ruby Interface for C++ Extensions</sublist>
    <sublist>Like Boost.Python</sublist></group>
  </page>

  <page>
    <group><title>Rice and integer conversions</title></group>
    <group><code>
/* C++ to Ruby: plain API, then Rice */
VALUE x_v = ULONG2NUM(x);
Object y_v = to_ruby(y);

/* Ruby to C++: plain API, then Rice */
unsigned long x = NUM2ULONG(x_v);
unsigned long y = from_ruby&lt;unsigned long&gt;(y_v);
    </code></group>
  </page>

  <page>
    <group><title>Arrays in Rice</title>
<code>
    Array a;
    a.push(42);
    a[0] = 10;
    std::cout &lt;&lt; a[0] &lt;&lt; std::endl; // => 10
</code>
</group>
  </page>

  <page>
    <group><title>Defining classes in Rice</title>
<code>
class Animal : public Organism
{
public:
  virtual ~Animal() { }
  virtual char const * speak() = 0;
};
</code>
</group>
</page>

  <page>
    <group><title>Defining classes in Rice (cont'd)</title>
<code>
class Dog : public Animal
{
public:
  virtual ~Dog() { }
  virtual char const * speak()
  {
    return "Woof woof";
  }
};
</code></group>
  </page>

  <page>
    <group><title>Defining classes in Rice (cont'd)</title>
<code>
define_class&lt;Animal&gt;("Animal")
  .define_method("speak", &amp;Animal::speak);

define_class&lt;Dog, Animal&gt;("Dog")
  .define_constructor(Constructor&lt;Dog&gt;());
</code></group>
  </page>

  <page>
    <group><title>Final thoughts</title></group>
    <group><list>Be observant</list></group>
    <group><list>Pay attention to detail</list></group>
    <group><list>Think about:</list></group>
    <group><sublist>Exceptions and edge cases</sublist></group>
    <group><sublist>Object ownership</sublist></group>
    <group><sublist>Testing</sublist></group>
    <group><list>Use a tool</list></group>
    <group><text> </text>
      <list>Questions?</list>
    </group>
  </page>

  <!-- TODO:
    * thread safety
    * more type safety
    * api conversion
    * more exception safety
    * inheritance
    * more unit testing
    * callbacks
    * more data serialization
  -->

</presentation>

<!-- vim:tw=100
-->

