One Div Zerohttp://james-iry.blogspot.com/An exploration of software development.PyRSS2Gen-1.0.0http://blogs.law.harvard.edu/tech/rssWhen CONSTants Varyhttp://james-iry.blogspot.com/2010/07/when-constants-vary.html<div class="post-body entry-content"> <p>/*</p><p>Occasionally somebody will suggest that language X would be better off with C++'s "const", a type qualifer that supposedly makes objects immutable in certain contexts[1]. I'm not convinced. Const requires a fair amount of work and provides fairly small guarantees. This post is a bit of literate C++ program demonstrating some of the weaknesses in const.</p><p>There are a couple of holes that I want to dismiss quickly. First there's the weakness that const can be cast away. But hey, the C and C++ motto is "casting away safety since 1972." Blowing your leg off with a cast is half the fun of using the C derived languages. Second, there's the useful, if slightly misnamed, weakness of the "mutable"[2] keyword which can be applied to a field and says "this field can be mutated even in const contexts." Mutable is useful for things like memoizing.</p><p>But even ignoring those very deliberate issues, there are still a couple of fundamental weaknesses in how C++ deals with const.</p><p>For my example I'm going to use a simple toy class, a Counter that can be incremented (mutating its state) and then queried to see what the current count is.</p><p>*/</p><pre> #include &lt;iostream&gt; using namespace std; class Counter { int _count; public: Counter() : _count(0) {} void inc() { _count++; } </pre> <p>/*</p><p>A perfectly good use of const. The count() query doesn't change the state of this Counter.</p><p>*/</p><pre> int count() const{ return _count; } }; </pre> <p>/*</p><h3>Indirection</h3><p>So far, so good. But we can easily destroy const with a bit of indirection. Imagine I need something that holds a pair of Counters and, for whatever reason, I need to do so via pointers.</p><p>*/</p><pre> class CounterPair { Counter *_c1; Counter *_c2; public: CounterPair() { _c1 = new Counter(); _c2 = new Counter(); } ~CounterPair() { delete _c1; delete _c2; } </pre><p>/*</p><p>These two methods query the state of the CounterPair and are safely const</p><p>*/</p><pre> int count1() const { return _c1 -> count(); } int count2() const { return _c2 -> count(); } </pre><p>/*</p><p>Up to this point everything is golden. But these next two methods modify the state of the CounterPair, yet are marked const.</p><p>*/</p><pre> void inc1() const { _c1 -> inc(); } void inc2() const { _c2 -> inc(); } }; </pre><p>/*</p><p>What gives? Well, C++ doesn't appear to understand the that state and memory are two different things. I didn't modify the memory that lies directly in one CounterPair envelope but I'm clearly modifying the state of the CounterPair. This is one area where the D programming language <a href="http://www.digitalmars.com/d/2.0/const-faq.html#transitive-const">does much better</a> than C++.</p><h3>Aliasing</h3><p>An even more subtle issue happens with aliasing. This function does not modify its first argument but does modify its second. The first is marked const.</p><p>*/</p><pre> void incSecondCounter(Counter const &amp;c1, Counter &amp;c2) { c2.inc(); } int main(int argc, char** argv) { Counter c; </pre><p>/*</p><p>But if we create a bit of aliasing then c1's referent gets modified</p><p>*/</p><pre> Counter const &amp;c1 = c; Counter &amp;c2 = c; cout << c1.count() << endl; incSecondCounter(c1, c2); cout << c1.count() << endl; }</pre><p>/*</p><p>In this toy program that looks silly, but in larger programs aliasing can be quite subtle. And aliasing is a tricky, tricky problem to deal with well at the type level so even the D language will have an equivalent hole in const. At best what const says here is that "the object won't be modified via that particular reference," which is a heck of a lot weaker than "the object won't be modified."</p><h3>Conclusion</h3><p>Const correctness requires a lot of typing of both the push-the-buttons-on-the-keyboard kind and the keep-certain-static-properties-consistent variety. Yet it gives back relatively little. Even without poking holes in const using casting or "mutable" it's easy to get mutation in something that's supposed to be immutable. C++'s variety of const doesn't deal with indirection at all well and aliasing is a tricky nut that would substantially alter a language's type system. Is const worth it? I'm not sure.</p><h4>Footnotes</h4><ol><li>C'99 also has a const keyword with very similar semantics. So if you replace references with pointers and strip out the thin veneer of OO this article would cover C as well.</li><li>Misnamed because fields are generally mutable even without the keyword. But I suspect the committee would balk at "mutable_even_under_const."</li></ol><p>*/</p> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-9120467841801838459Tue, 20 Jul 2010 12:46:15 GMTCode Monkeyism's Post Is Unfit For Serious Readinghttp://james-iry.blogspot.com/2010/07/code-monkeyisms-post-is-unfit-for.html<div class="post-body entry-content"> <p>Stephan Schmidt makes several points in <a href="http://codemonkeyism.com/scala-unfit-development/">Scala Is Unfit For Serious Development</a> . Some are even valid. Unfortunately, the valid points are buried in over-generalizations and attacks on the attitudes of the developers and community.</p><p><em>It's unfit because the developers and the community are unwilling.</em></p><p>At the end of his little diatribe he thanks 2 members of the community for helping him with his problems. Sounds pretty willing to me.</p><p> <em>1. The developers are interested in language research and writing papers, not in making a language for real development</em></p><p>So far from true. Yes they write papers. But real development is very much a goal of Scala. If it weren't then Java interoperability would be ditched yesterday. Java interop is a giant albatross around Scala's theoretical neck. Some examples of how Java interop fights Scala: null sucks, the jvm doesn't do general tail calls, structural types are hacked in using reflection, and traits/mixins require a rube-goldberg copy mechanism. Java interop is only a practical concern for real programmers who want to get stuff down now with existing libraries.</p><p><em>2. The version hell that is 2.8 - calling this version 2.8.0 is deceptive. As Python (3000) or Ruby (2.0) this should have been a major version called 3.0</em></p><p>Some truth. Ruby made nasty breaking changes in 1.9.1 and Python isn't immune to them. None-the-less, it really should be called Scala 3.0 to communicate the extent of the changes.</p><p><em>3. The developers do not care about binary compatibility, deprecated, soft migration or API compatibility. Things are changed on a whim - from one RC to the next, and they change major parts of the language/libraries in point releases.</em></p><p>There is a significant technical challenge to maintain binary compatibility given the fact that the JVM has no multiple-inheritance mechanism. But that shouldn't be confused with a lack of caring. Martin Odersky has promised that the next major release will focus on solving the binary compatibility issue once and for all. Deprecation - could have been done better. Definitely. Then again, Schmidt is playing with beta software where API INcompatibility are pretty much guaranteed.</p><p><em>4. They remove and add classes just as they see fit.</em></p><p>A repeat of point 3.</p><p><em>5. The community loves cutting edge. Instead of fixing bugs they upgrade to all new released RCs - so have you. It's called cutting edge for a reason: You'll cut yourself. And it's called bleeding edge because you will bleed.</em></p><p>An overgeneralization. "The community" is apparently represented by the library causing Schmidt his problem. Other developers seem to be more careful. lift, for instance, has been very careful about it's migration policy and just released version 2.0 against the stable version of Scala. </p><h3>Conlclusion</h3><p>Scala's binary compatibility sucks due to a real technical challenge interoperating efficiently with the JVM. The 2.8 series is really 3.0. Deprecation could be done better. One library that Schmidt used decided to fix a bug and compile against a beta version of Scala. Schmidt jumped into the beta version. A small bit of versioning hell ensued. From this Schmidt concluded that the Scala development team do not care and that Scala is unfit for development.</p><p>In the meantime, back in reality, the small Scala development team is continuing to move forward with a great language. The community continues to put out awesome libraries. Businesses like Novell, Twitter, and FourSquare continue to make bets on Scala.</p><p>Is Scala flawed. Hell yes. But if you think it's a great language then contribute. Help make it and it's ecosystem better. Help the developers of libraries maintain bug fixes on stable branches, or help test for breaking changes and push for a deprecation model, or contribute constructively to the binary compatibility challenge. Don't take potshots against the developers from the sidelines.</p> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-9152348724635748102Fri, 09 Jul 2010 09:50:25 GMTWho Will Throw the Hammer This Time?http://james-iry.blogspot.com/2010/05/who-will-throw-hammer-this-time.html<div class="post-body entry-content"> <img src="http://www.pogofish.com/whowillthrowthehammer.png" alt="Steve Jobs 1984 Who will throw the hammer this time?" /> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-7041142113636747263Thu, 27 May 2010 07:00:34 GMTAnatomy of an Annoyancehttp://james-iry.blogspot.com/2010/05/anatomy-of-annoyance.html<div class="post-body entry-content"> <p>The warning "Type safety : A generic array of T is created for a varargs parameter" is so irritating it makes me want to beat Java with a snoracle, an implement that allows defunct tech companies to breath while underwater. This article will explain what the warning is, where it comes from, and why it is pretty boneheaded. The warning also shows just how hard it is to extend an existing language without creating pain for your users.</p><p>My sample use case: anytime you build a Map in Java it takes approximately 6 metric tons of boilerplate.</p><pre> Map&lt;Integer, String&gt; numbers = new HashMap&lt;Integer, String&gt;(3); numbers.put(1, "one"); numbers.put(2, "two"); numbers.put(3, "three"); // etc </pre><p>It would sure be nice if that could be done in one line of code as in most sane languages. So let's fix it! First, we need the ubiquitous Pair class[1].</p><pre> public static class Pair&lt;A, B&gt; { private A first; private B second; public Pair(A first, B second) { this.first = first; this.second = second; } public A getFirst() { return first; } public B getSecond() { return second; } /* a real Pair class would also have equals and hashCode, but for this code I don't need them */ } </pre><p>And then, to make it more pleasant to use, we write a static function that can take advantage of Java's half-assed type inference.</p><pre> public static &lt;A, B&gt; Pair&lt;A, B&gt; pair(A first, B second) { return new Pair&lt;A, B&gt;(first, second); } </pre><p>Finally, the <em>pièce de résistance</em></p><pre> public static &lt;A, B&gt; Map&lt;A, B&gt; map( Pair&lt;A, B&gt;... pairs) { Map&lt;A, B&gt; map = new HashMap&lt;A, B&gt;(pairs.length); for(Pair&lt;A, B&gt; pair : pairs) { map.put(pair.getFirst(), pair.getSecond()); } return map; } </pre><p>And so, with faces smiling as the <del>sun</del>oracle shines down upon us, we use it</p><pre> Map&lt;Integer, String&gt; numbers = map(pair(1, "one"), pair(2, "two"), pair(3, "three")); </pre><p>Ah, sweet sweet beauty[2]. Except one nasty, tiny, shoot-me-now-and-bury-my-carcass-in-downtown-Santa-Clara annoyance. That single lovely line of code that so innocently tried to use all our wonderful machinery is tagged with the warning "Type safety : A generic array of T is created for a varargs parameter." And the only way to "fix" it is to nuke the whole area from orbit with '@SuppressWarnings("unchecked"),' possibly hiding real problems in the same method. Gah.</p><h3>History, Part The First</h3><p>To understand the problem we have to go way back in history, at least as far as Java 1.0. Maybe even to Oak or to the egg and sperm that fused to make James Gosling. Whatever. In designing Java there was a question: since String is a subtype of Object shouldn't String[] be a subtype of Object[]. After all sometimes that's useful: if I have an array of Strings and you want to write a function that iterates through an array of Objects, aren't we good if I send you my Strings?</p><p>Well, no, we're not good. We're very bad. If String[] is a subtype of Object[] then Bad Things Happen&trade;</p><pre> void badThing(Object[] objects) { objects[0] = new Integer(42); } void test() { String[] strings = {"a", "b", "c"}; badThing(strings); // length of an integer? System.out.println(strings[0].length()); } </pre><p>If the above were allowed then we'd be trying to find the length of an Integer and that, um, won't work. Reading is okay but writing is bad. So to solve the problem the original Java designers made a pact with Satan. Instead of statically preventing code like that they added a dynamic check: anytime you store into a non-primitive array Java does an instanceOf-like check to make sure it's legit. In my sample code, the assignment is checked in "objects[0] = new Integer(42)". Failure will give an ArrayStoreException. That means every array store has to do expensive extra work and, more importantly to this article, at time of construction an array must be associated with a dynamic type tag. Like all pacts with Satan the negative consequences of this deal weren't obvious until much later.</p><h3>History, Part the Second</h3><p>With Java 1.5, Java programmers were finally able to move to the forefront of 1970's era technology with parameteric polymorphism, er, excuse me, generics. But see, there was this huge collection library and 3rd party libraries that extended it and...how the hell are we going to <a href="http://lamp.epfl.ch/pizza/gj/Documents/gj-oopsla.pdf">make the future safe for the past</a>?</p><p>The decision was to compile with "type erasure" - dynamic type tags no longer carry all the information in the static type. Now, there's nothing evil about type erasure. Some of my best friends are languages that compile with even more type erasure than Java. It's just that Java has "instanceof" and other similar things that don't play well with it. Relevant to this article we have the aforementioned pact with Satan where arrays need a dynamic type tag or Bad Things Happen&trade;.</p><p>Concerned with Bad Things, the Java designers started spackling over some holes. The upshot: this code fails to compile with the static error "Cannot create a generic array of T"</p><pre> public &lt;T&gt; T[] arrayOfT(T a1, T a2, T a3) { return new T[]{a1, a2, a3}; // bzzt static error! } </pre><p>Same deal with this, "Cannot create a generic array of Pair&lt;A, B&gt;".</p><pre> public &lt;A, B&gt; Pair&lt;A, B&gt;[] twoPairs( A a1, B b1, A a2, B b2) { return new Pair&lt;A, B&gt;[]{pair(a1, b2), pair(a2, b2)}; } </pre><p>But this is a very shallow spackling indeed. Keep reading.</p><h3>History, Part the Third</h3><p>Also with Java 1.5, the designers added varargs: the ability to pass an arbitray number of arguments of the same static type to a method. So convenient! But, uh, apparently the varargs guys and the generics guys didn't talk to each other until late in the game. See, the varargs folks decided to implement varargs as sugar over arrays rather than something that's actually type safe.</p><pre> public void whatClassIsIt(Object... things) { System.out.println(things.getClass()); } whatClassIsIt(pair(1, "one"), pair(2, "two"), pair(3, "three")); </pre><p>There's our old buddy erasure. The above code prints "class [LPair;" - array of Pairs. But note it's not dynamically known to be an array of Pair&lt;Integer, String&gt;.</p><h3>Making Bad Things Happen&trade;</h3><p>Now we've got enough knowledge to screw things up nicely. We've got statically unsafe arrays that depend on runtime type tags to provide a reasonably early error. And we have an ability to create arrays that don't carry all the information needed to do the runtime check.</p><pre> public static &lt;A, B&gt; Pair&lt;A, B&gt;[] pairs( Pair&lt;A, B&gt;... pairs) { return pairs; } Pair&lt;Integer, String&gt;[] pairs = pairs(pair(1, "one"), pair(2, "two"), pair(3, "three")); Object[] objects = pairs; // no error in this assignment, statically or dynamically! objects[0] = pair('c', 2.0); // arrgg ClassCastException System.out.println(pairs[0].getSecond().length()); </pre><p>The code doesn't have an error at the point of assigment. The error does finally occur later when I attempt to use a value. That means that in real code the error could be very far from the real cause. The only diagnostic that prevents this Bad Thing&trade; from happening is the warning on the first line; our lovely "Type safety : A generic array of T is created for a varargs parameter".[3] And that warning has to occur in a place that doesn't know whether the use of the array is actually bad or not. Hence, our code's beauty (or at least what passes for beauty in Java) is shot.</p><h3>Conclusion</h3><p>Ah, if only. If only Java didn't have covariant arrays. Or, since it does, if only the designers had used something other than arrays to carry varargs. Woulda coulda shoulda. This article shows just how hard it can be to extend an existing language without shooting yourself and your users in the ass. Runtime checked array assignment + varargs in arrays + type erasure == pain.</p><p>Now you know why Scala's arrays are invariant.</p><h4>Footnotes</h4><ol><li>Q: What makes Java so manly? A: It forces every programmer to grow a Pair.</li><li>Please don't rag me about how I wrote 17 lines of code to avoid writing 3 - presumably I'd reuse this infrastructure a lot.</li><li>For an even nastier problem with the erasure scheme and arrays see this <a href="http://forums.sun.com/thread.jspa?threadID=5319823">example of trying to write higher order code using generics and varargs</a>.</li></ol> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-2933252320565217229Tue, 25 May 2010 09:53:51 GMTTypes à la Charthttp://james-iry.blogspot.com/2010/05/types-la-chart.html<div class="post-body entry-content"> <p>I recently saw a debate on whether a certain language had a "weak" or "strong" type system. Naturally the debate ended in flames. As far as I could tell neither side knew what the other was talking about which isn't surprising as I've almost as many definitions of "weak" and "strong" typing as there are developers who use the words. Frankly the words have too many meanings to be useful.</p><p>If the first part of the problem is ambiguity then it's still only a start of the problems. Even if you get something crisp like "static" vs "dynamic" you've still managed to condense out a lot of useful information. Agda and Java live in incredibly different spaces even though they are both statically typed. Or how 'bout the world of difference between E and Ruby, two dynamically typed languages?</p><p>In this article I'm going to try to convince you that the space is far more complicated than the typical debates would allow. The design space is very highly dimensioned, perhaps infinitely so, but I'll boil it down to 2 important ones which none-the-less already show considerably more sophistication. Here's the set up: what can you know statically about an expression, like f(x), without dynamically executing/evaluating it and if all you know is static types? The two dimensions I use are 1) How "safe" is it - are there back channels that might violate abstractions? And 2) How rich and sophisticated can the static type be?</p><p>One big, fat caveat: this is almost certainly wrong in many details. I've listed a bunch of languages. There just aren't enough hours in a lifetime to understand in detail exactly what each one can promise and how one type logic does or does not contain another. Naturally if you are an expert on one of these languages and I got something a bit wrong you will conclude I'm an idiot even if everything else is 100% right. Such is a the price of writing for a technical audience. In spite of any such errors I contend that even if some details are wrong the overall picture should be reasonably accurate and should convey my central thesis that this space is too complicated for binary distinctions.</p><p>Click to zoom</p><p><a href="http://www.pogofish.com/types.png"><img height="400" width="400" src="http://www.pogofish.com/types.png" alt="Chart of Static Type Properties" /></a></p><h3>Power</h3><p>The horizontal axis is "power" or "sophistication" of the static type system's underlying logic. For this analysis I focused only on the ability of the static type system to describe actual data structures and computations. While it's interesting that C++, GHC Haskell, and Scala all have Turing complete type systems, none of their type systems seem to have the same ability to usefully describe computation as Agda's. Of course, I could be wrong. Feel free to bump the asterisked languages to the right accordingly.</p><p>The horizontal axis can also be read backwards in terms of type system simplicity.</p><p>The far left requires some explanation. The languages in this column tend to be dynamically typed which at first seems like a challenge for thinking about them in static terms. But there's a nice theoretical framework where you think of them as having one static type. A valid Scheme expression has "the Scheme type" and an invalid one does not. Different people who mean to emphasize different things will call these languages "untyped," "unityped", or "dynamically typed." For my purpose "unityped" gives the right connotation. This column corresponds to the untyped lambda calculus.</p><p>Moving to the right there are simple first order type systems. Such type systems are a bit richer than the unityped ones, but don't have any form of type abstraction: they don't allow a user to define types that will be parameterized by other types. This column corresponds to the simply typed lambda calculus.</p><p>Further right are second order type systems. These allow type abstraction qualified with "forall", and correspond to System F. In between first and second order is HM (Hindley-Milner) which allows full type inference but at the price of not allowing some second order things like higher rank polymorphism. Moving to the right are higher kinded languages, corresponding to System Fn up through System Fω</p><p>At the far right are dependently typed languages: languages that allows types to be parameterized by values[1][2]</p><h3>Safety</h3><p>The vertical axis is various kinds of "safety." For safety analysis I'm completely ignoring foreign function interfaces and the like. FFIs are useful and pragmatic but also make otherwise pretty safe languages no more safe than Assembly.[3] </p><p>You can also read this axis going the opposite direction in terms of a kind of operational power - Assembly isn't safe, but it lets you make the machine dance.</p><p>At the bottom end are memory unsafe languages. An expression like f(x) might very well totally hose memory. </p><p>A bit up from that is memory safe languages. Such languages have a few built in abstractions (like you can't write past the end of an array) that f(x) can't possibly violate. A couple of steps up is abstraction safety. That means that one developer's abstractions can be protected from another developer in some fashion. For instance, private variables can only be accessed through functions defined in a scope with access to the variable. Many popular "dynamic languages" are not abstraction safe because their powerful dynamic meta-programming facilities allow abstractions to be smashed. In between I've listed a few JVM languages as being configurably safe: the Java standard includes some pretty powerful reflection capabilities that allow many developer written abstractions to be violated, but the JVM standard includes a security mechanism that allows that to be turned off. Move these points up and down accordingly.</p><p>At the next level are capability safe languages. In most languages, any function call can have all manner of harmful side effects including formatting your hard drive or launching missiles. But at the capability safe level the expression cannot possibly do such crazy things unless granted that power in the form of an "object" that represents the capability to do so. While you cannot statically know whether f(x) has received a missile launching capability, you can statically know that if f(x) lives in a program that has not been granted missile launching capability then it will not launch missiles.</p><p>The pure level is for languages with a strict, statically enforced wall between side effecting expressions and non-side effecting ones. Examining the static type is sufficient to know the difference. Finally, the "pure &amp; total" level is for languages where the expression f(x) is either guaranteed to terminate normally (no infinite loops, no exceptions)[4] or where static types indicate possible non-totality.</p><h3>Random Notes</h3><p>In many languages safety properties are enforced dynamically so a knee jerk reaction is "why is this stuff in a discussion about static properties." But remember the point was what I can statically know about an arbitrary expression f(x) without evaluating it. In Ruby I know with absolute conviction that, barring implementation bugs and foreign function interfacing, any arbitrary f(x) absolutely does not write past the end of an array. In C I don't know that.</p><p>It might seem odd to put Java further to the right than Haskell '98 but it's true. Haskell'98 has a slightly extended Hindley-Milner type system while Java has a kind of warped System F&lt;: (use nesting to create rank-n polymorphism). Of course, a chart that ranked type systems on "bang for the buck", versatility vs ease of use, would rank things quite differently.</p><p>C# gets stuffed into the unsafe row because of the "unsafe" keyword and associated pointer machinery. But if you want to think of these as a kind of inline foreign function mechanism feel free to bump C# skyward.</p><p>I've marked Haskell '98 as pure but GHC Haskell as not pure due to the unsafe* family of functions. If you disable unsafe* then jump GHC Haskell up accordingly.</p><p>Assembly is problematic. In a certain weird sense some Assemblies are "memory safe" because they do protect their language abstractions by virtue of having almost none.</p><h3>Conclusion</h3><p>"Weak" and "strong" mean so many things that they mean nothing. "Static" and "dynamic" mean something but don't really convey many differences. This chart is not the end-all-be-all of language categorization, but it is the beginning of the start of an inkling of a glimpse at how rich the space really is. Hopefully this will make static typing advocates understand that dynamic typing doesn't necessarily mean total insanity while dynamic typing advocates can begin to see the range of sophisticated expression afforded by many modern statically typed languages. I highly recommend <a href="http://www.pphsg.org/cdsmith/types.html">"What To Know Before Debating Type Systems"</a> for more on the subject.</p><h4>Footnotes</h4><ol><li>Type theory folk will note that I've pretty casually collapsed two legs of the "lambda cube" into one dimension of power. That's not very valid but that seems to work out in practice. It's hard to find a sophisticated dependently typed language (types indexed by values) that isn't also very rich when it comes to types indexed by types.</li><li>C++'s ability to parameterize templates by a few forms of literal shouldn't be confused with dependent typing.</li><li>Technically there's no such thing as "Assembly;" it's a family of vaguely related languages with semantics tied extremely closely to underlying machines.</li><li>A total language can of course crash due to resource exhaustion or hardware failure. There's no way to prevent that.</li></ol> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-5671321100639097484Wed, 05 May 2010 09:55:07 GMTC Is Not Assemblyhttp://james-iry.blogspot.com/2010/04/c-is-not-assembly.html<div class="post-body entry-content"> <p>In my <a href="http://james-iry.blogspot.com/2010/04/good-math-bad-pointer-math.html">last article</a> I dug down into some of the joys of undefined pointer behavior in C. But I did it in the context of an architecture that most developers aren't too likely to ever see again, the Intel 8086. I wanted to show that this stuff matters even with more mainstream architectures because compilers are free to do a lot of non-obvious things. C is not assembly.</p><p>The <a href="http://www.us-cert.gov/aboutus.html">United States Computer Emergency Readiness Team (US-CERT)</a> "<em>is charged with providing response support and defense against cyber attacks for the Federal Civil Executive Branch (.gov) and information sharing and collaboration with state and local government, industry and international partners.</em>"</p><p>With a U.S. Department of Homeland Security badge on their page you know they're serious. When you overflow your buffers the terrorists win. So you'd think they'd take C seriously.</p><p>I found a real WTF gem caused by programmers treating C like assembly. <a href="http://www.kb.cert.org/vuls/id/162289">Vulnerability Note VU#162289</a></p><p><em>"In the C language, given the following types:</em></p><pre> char *buf; int len; </pre><p><em>some C compilers will assume that buf+len &gt;= buf. As a result, code that performs wrapping checks similar to the following:</em></p><pre>len = 1&lt;&lt;30; [...] if(buf+len &lt; buf) /* wrap check */ [...overflow occurred...] </pre><p><em>are optimized out by these compilers; no object code to perform the check will appear in the resulting executable program. In the case where the wrap test expression is optimized out, a subsequent manipulation of len could cause an overflow. As a result, applications that perform such checks may be vulnerable to buffer overflows."</em></p><p>The advisory is careful to admit that compilers are free to do just that. Here's why: greatly simplified, the C standard says a pointer must point at a valid object or just past the end. Any pointer arithmetic that might cause a pointer to step outside those bounds yields undefined behavior. So by definition either buf + len >= buf or the program is free to do anything up to and including launching shoulder mounted kitten missiles at Capitol Hill.</p><p>Still, there is a WTF to lay on the compiler writer here. If a programmer writes an "if" test then presumably he or she had some reason to believe that sometimes the test might be true. Before optimizing away the conditional the compiler really should have issued a warning.</p><p>In order for this code to have any hope of working a few assumptions must hold: sizeof(int) &lt;= sizeof(char *), overflow must happen "silently", etc. But there's another major assumption here: the buffer pointed to by buf must be located at the end of its address space. With a check like this, if there are any objects located higher in the same overflow segment then those objects are getting some kitten missiles. So another WTF is a developer making an assumption about how a compiler works in the face of undefined code.</p><p>Now, there are a few scenarios where all these assumptions might be justified. For instance, if you're targeting some special purpose embedded device then memory layout might be well understood. In such situations, the optimization performed by the compiler might be shocking indeed, even if technically permissible.</p><p>The problem is that the developer is thinking at the assembly code level but the C standard says the compiler doesn't have to "think" the same way. In assembly the distance between what you write and the object code generated is pretty small (barring some kind of complicated macro magic). In C the distance is quite a bit larger.</p><p>Repeat after me, C is not assembly.</p><h3>The Solution?</h3><p>In my mind the real WTF is the US-CERT's proposed solution.</p><p>"<em>Cast objects of type char* to uintptr_t before comparison. The faulty wrapping check listed above would be written</em></p><pre>#include &lt;stdint.h&gt; [...] if((uintptr_t)buf+len &lt; (uintptr_t)buf) [...] </pre><p><em>Alternatively, developers can use size_t on platforms that do not provide the uintptr_t type.</em>"</p><p>Yes, that's right, the proposed solution is to continue with the same undefined behavior of adding arbitrary ints to pointers but to also add some casts with undefined semantics. Well...WTF? Didn't we get here because the compiler was free to do anything in the face of undefined semantics?</p><p>C is not assembly. Not even when you do some casting. A compiler would be totally free to optimize away even this check by "seeing through" the cast as being the same thing as the uncasted code. Or the cast to an integral type could do anything at all like invert the bits. Or the overflow could stop the program cold. Or shoulder mounted kitten missiles. Whatever.</p><h3>The Real Solution</h3><p>The best solution is "don't do that." Give the buffer a size and compare len to that size. The assumption that the buffer is the last object in its address space is just paying with fire. But I'll assume they really did need to use memory at the end of the address space. Fine. What's the solution?</p><p>It's this: you can't say "end of address space" portably in C so tread carefully when you use undefined behavior. Minimize the amount of undefined behavior you depend on. Pull the non-portable stuff out into separate compilation units and document that the assumptions. Maybe even write the non-portable stuff in assembly where you know what's going to happen, but at least write a unit test that ensures that the compiler behaves the way you think it does. And always, always remember that C is not assembly.</p> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-4479598368262226583Mon, 12 Apr 2010 16:22:56 GMTGood Math, Bad Pointer Mathhttp://james-iry.blogspot.com/2010/04/good-math-bad-pointer-math.html<div class="post-body entry-content"> <p>Heard about the new C bandwagon? According to April's Tiobe report C has eclipsed Java as the number 1 most popular language by a whopping 0.007%. So while all the rest of you Java suckers have just lost all professional relevance, I'm OK. I cut my professional teeth on C.</p><p>Sometime ago on Twitter I gave in-jest advice to new C programmers: all your code and data are in one giant mutable array indexed by pointers, good luck. David MacIver (http://twitter.com/drmaciver) responded that the C standard didn't say that. Whereupon I swore a mighty blood oath of vengeance upon him for ruining my joke with facts.</p><p>Vengeance is hard so I'll get started later, maybe tomorrow or next week. Right now I'll write a bit about the facts behind pointers. One of the trickiest aspects of writing portable C programs is that a great deal of the specification says that the behavior of certain constructs is undefined. Old C hands sometimes point out some undefined behavior in a C newbie's program whereupon the newbie fires up gcc or msvc or whatever and shows that the code behaves exactly as it "should." Who's right? What does it mean to be undefined?</p><p>C was designed to be both 1) portable across a wide range of platforms and 2) work at a very low level. Those two goals are pretty much in opposition - low level details can vary widely across platforms. Don't think MacIntel vs Wintel, think IBM System/360 vs DEC PDP-11 vs Motorola 68x vs x86. And so the C standard says that certain things are syntactically valid but the behavior can depend on compiler, options, hardware, and the day of the week. For example according to the standard dereferencing a null pointer is undefined. It may segfault; silently appear to succeed but actually return garbage; update your Facebook page with your masturbation frequency; or none of the above. It's all good.</p><p>Why should you care about portability if you aren't planning to port? Well, for one thing, undefined behavior can vary across versions of the same compiler or even different compiler options so planning not to port is not a realistic option. For another, even when you think you know what is going to happen on your target platform you might be wrong in some cases - some <a href="http://blog.cr0.org/2009/08/linux-null-pointer-dereference-due-to.html">security critical cases</a>, even. And finally, if you're good at hacking C there's a good chance you'll someday be called upon to write code for some device with a very different architecture. You might as well get into good habits now.</p><h3>NULLity</h3><p>While I was unable to find an example of Facebook updating I was able to find examples of two other distinct null pointer behaviors. The following program derefs a null:</p><pre> #include &lt;stdio.h&gt; int main() { puts(NULL); }</pre><p>On one common platform the program caused a segfault as you might expect. But on a second, mystery platform (to be revealed shortly) the program ran to "successful" completion with the output "&#9553;&#9658;&#9572;". Yup, garbage, not crash.</p><p>Here's another one. The standard says that a null pointer can textually be written with a 0 in cases where the compiler can statically prove that you want a pointer, but it does not guarantee that a null pointer will in fact consist of a bit pattern of all zeros. So if you declare</p><pre> const int n = 0; </pre><p>Then the following will always be true</p><pre> NULL == 0; n == 0;</pre><p>But the following might or might not be true</p><pre> NULL == (void *)n; (int)NULL == n; (int)NULL == 0;</pre><p>Unfortunately, it's a bit difficult to find a platform where the null pointer is not a pattern of 0 bits so this non-portable code is surprisingly portable, but see the <a href="http://c-faq.com/null/machexamp.html">comp.lang.c FAQ</a>.</p><h3>Good Math, Bad Pointer Math</h3><p>Here is an example of some code that does behave very, very differently.</p><pre> #include &lt;stdio.h&gt; int main() { const unsigned long base = 0xfffc; /* table header */ printf("\t\tlong\t\tchar *\n"); /* size of data types */ printf("size\t\t%d\t\t%d\n", sizeof(base), sizeof((char *)base)); /* the core table */ for (unsigned long i = 0; i &lt; 8; i++) { unsigned long n = base + i; char *p = (char *)base + i; printf("%08lx + %ld\t%08lx\t%08lx\n", base, i, n, p); } }</pre><p>C aficionados will be horrified that I'm using non-portable %08lx instead of portable %p to output a pointer, but using %p would reveal too much. On the first, common platform the result was</p><pre> long char * size 4 4 0000fffc + 0 0000fffc 0000fffc 0000fffc + 1 0000fffd 0000fffd 0000fffc + 2 0000fffe 0000fffe 0000fffc + 3 0000ffff 0000ffff 0000fffc + 4 00010000 00010000 0000fffc + 5 00010001 00010001 0000fffc + 6 00010002 00010002 0000fffc + 7 00010003 00010003 </pre><p>So both longs and pointers are 4 bytes (32 bits) and both long and pointer math are pretty ordinary.</p><p>Now, on the second, mystery platform with one set of compiler options I get this</p><pre> long char * size 4 4 0000fffc + 0 0000fffc 0000fffc 0000fffc + 1 0000fffd 0000fffd 0000fffc + 2 0000fffe 0000fffe 0000fffc + 3 0000ffff 0000ffff 0000fffc + 4 00010000 10000000 0000fffc + 5 00010001 10000001 0000fffc + 6 00010002 10000002 0000fffc + 7 00010003 10000003</pre><p>Still 32 bits but pointer math is, um, strange. 3 nibbles (12 bits) have been skipped. It gets perhaps weirder with a different set of compiler options on the mystery platform.</p><pre> long char * size 4 4 0000fffc + 0 0000fffc 0000fffc 0000fffc + 1 0000fffd 0000fffd 0000fffc + 2 0000fffe 0000fffe 0000fffc + 3 0000ffff 0000ffff 0000fffc + 4 00010000 00000000 0000fffc + 5 00010001 00000001 0000fffc + 6 00010002 00000002 0000fffc + 7 00010003 00000003</pre><p>So now my 32 bit pointer is wrapping around at a mere 16 bits. And that's a clue to our mystery: we're kicking it old skool.</p><h3>Mystery Revealed</h3><p>For my first, common platform I'm using "gcc -std=c99" on 64 bit Ubuntu x86 but compiling using "-m32" to force 32 bit compilation. The second, mystery platform is the Watcom C compiler in C99 mode (wcc -za99) on the <a href="http://www.freedos.org/">FreeDOS</a> MS-DOS compatible operating system running in a <a href="http://www.virtualbox.org/">VirtualBox</a> virtual machine.</p><p>Using DOS keeps the machine in "real mode" - which basically means the chip (or virtual chip) behaves as an 8086[1]. Now, the 8086 is a peculiar beast. It's a 16 bit chip - registers are all 16 bits and most instructions work with 16 bit words - but it has a 20 bit memory address width. How do you squeeze 20 bits into a 16 bit word? Well, internally the 8086 uses a highly optimized form of data compression based on lzw that...</p><p>No, wait, that's not right. The answer is that the 8086 un-squeezes 20 bits into 32 bits - two 16 bit words instead of one. But here's where it gets weird. Instead of saying "we'll allow 32 bits of addressing but make it a hardware fault if anything above the low 20 bits is set" the 8086 designers said, well, this:</p><pre> XXXX0 the "segment" shifted left by 4 bits + XXXX the "offset" ------- XXXXX final 20 bit address</pre><p>A segment register is shifted left 4 bits and then added to an offset register to get an actual address. So now it's clear why 0x0000fffc + 4 = 0x10000000 with one set of compiler flags: the segment has been incremented by 1 in the bits that count[2]. But why is the result 0x00000000 with a different set of compiler flags?</p><p>In order to get the 0x0000fffc + 4 = 0x10000000 behavior the machine code emitted by the compiler has to keep track of overflow - so every pointer add is not just a register add, but a register add plus an overflow check and perhaps a second register update. There may also be a move of two registers to/from memory instead of just one. That could be quite a performance hit on a lowly 4.77MHz machine (the original clock speed on the IBM PC).</p><p>So, as an optimization, most C compilers for 8086 real mode allow you to say that no structure is ever going to be larger than 2^16 bytes (64K) so that the emitted code doesn't always need to do all those gymnastics. The upshot: pointers silently overflow at 16 bits. 0x0000fffc + 4 = 0x00000000 and 0x1000fffc + 4 = 0x10000000.</p><p>If you're playing along at home the first result was achieved using the -mh ("huge" memory model) compiler flag and the second one was with -ml ("large" memory model). Both were linked with "wlink name test file test.obj". For more about the gory details of real mode memory models see <a href="http://www.users.pjwstk.edu.pl/~jms/qnx/help/watcom/compiler16/wmodels.html">16 bit memory models</a>[3].</p><h3>The Standard</h3><p>Back to earth now. The C standard is fine with all of the above behavior. As a gross oversimplification the standard says that</p><ol><li>A pointer must be NULL or must point at something valid or must point just past the end of something valid. Pointer math that would result in a pointer that is outside of those bounds is undefined. It might work like any of my examples, or it might put a phone order in for pizza.</li><li>A dereference of a NULL pointer or a pointer does not point at something valid, even if the pointer is only pointing one step past the end of something valid, is also undefined. You pray for a crash.</li></ol><p>In short, the C standard doesn't pretend that your program is one giant array. It pretends each individual allocation creates island unto itself (and may be part of a larger island or composed of smaller islands). The C standard is pretty quiet about what can happen between those islands. It's just that in modern platforms memory tends to behave like one giant flat mutable array so it's easy to assume that that's part of the C model.</p><p>One way to look at it is that the hypothetical C standard memory model treats pointers somewhat the way higher level languages treat references and array indexing, except that those higher level languages define bad behavior as either being statically impossible (no pointer math) or dynamically checked (null pointer exceptions and array index checks) where the C standard just says "undefined."</p><h3>In Defense of Undefined</h3><p>When people explain C's choices it's common to say that it's about implementation efficiency. Indeed that's a huge part of it. But remember that C was specifically designed to allow very low level access; to allow much (though by no means all) of what assembly allows. So C's ability to write to arbitrary bits of memory allows useful things like talking to memory mapped devices. That kind of code is not portable, but there are important bits of some kinds of programs that aren't going to be portable no matter what you do. The trick is to keep those bits walled off and as small as possible.</p><p>I'll see all you former Java programmers in the unemployment line as I drive by in my new car with the custom plates "PTRMATH%saccount=34389983829 password=h3ll0k1ttySegmentation fault.</p><h4>Footnotes</h4><ol><li>Yes, I'm keeping a "virtual" machine in "real" mode. I'm just as confused as you are.</li><li>Actually, one consequence of this funky scheme is that the same location in memory can be addressed multiple ways. 0x00120 + 0x0003 = 0x00123 and 0x00100 + 0x00023 = 0x00123. It's just that in the huge memory model the compiler's emitted code for pointer math keeps things canonical by using only the highest 4 bits of the segment.</li><li>For even more weirdness associated with this segmentation stuff read a bit about the <a href="http://www.openwatcom.org/index.php/A20_Line">A20 Line</a>, something that causes heartburn to bootloader writers to this day.</li></ol> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-2828860865593933782Thu, 08 Apr 2010 09:33:16 GMTRobert Fischer Finally Admits that Scala is Functionalhttp://james-iry.blogspot.com/2010/03/robert-fischer-finally-admits-that.html<div class="post-body entry-content"> <p>Robert Fischer <a href="http://enfranchisedmind.com/blog/posts/post-functional-scala/">says</a></p><p><em>If your language doesn&#8217;t lead people to re-discover point free programming at least in the small, then the language really isn&#8217;t taking function manipulation and functional language type conceptions seriously.</em></p><p>Thus Fischer has finally recognized that Scala is functional.</p><pre> val double : Int => Int = 2 * // point free function definition List(1,2,3) map double /* removes one point from List(1,2,3) map {x => double(x)} */ </pre><p>Thank you Robert for finally getting to the truth.</p><h3>Oh, the Underscore</h3><p>Now, Robert may object because I didn't combine map and double into one super function, mapDouble. Okay</p><pre> val mapDouble : List[Int] => List[Int] = _ map double mapDouble(List(1,2,3)) </pre><p>Crap, there's an underscore. Is it a "point"? Good question. And the answer is no. In fact, what it is is a hole in the expression. Oleg Kisiliov, somebody who actually knows about functional programming, says they're <a href="http://okmij.org/ftp/gengo/#Scala-trace">related to delimited continuations</a> and what could be more functional than <a href="lamp.epfl.ch/~rompf/continuations-icfp09.pdf">delimited continuations</a>?</p><h3>Also, Erlang Remains Unfunctional</h3><p>For some reason Fischer really hates on Erlang. Point-free is definitely not Erlang's strong suit at all due to it's ability to overload on arity. I don't know why the Erlangers don't go picket his blog.</p><h4>Post Script on Converses</h4><p>David MacIver points out (correctly) that my post assumes that a statement implies its converse. To that I say "pffft, write your own damn parody, David!"</p> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-6031173593466159927Mon, 08 Mar 2010 10:23:56 GMTGetting to the Bottom of Nothing At Allhttp://james-iry.blogspot.com/2009/08/getting-to-bottom-of-nothing-at-all.html<div class="post-body entry-content"> <p>Except when I'm being a total smart ass or self indulgent emo telling the Ruby community to f' off, most of what I write talks about practical stuff, often with a small dose of theory. But this time it's about theory with a small dose of practice. <a href="http://james-iry.blogspot.com/2009/07/void-vs-unit.html">Last time</a> I talked about the way programmers in popular C derived languages C++, Java, and C# think of a void function as being one that returns nothing. I contrasted this with the functional view that such functions return something, they just return the unit value (), something that doesn't carry any information. To expand on this I want to talk about actually returning nothing.</p><p>In any Turing complete language it is possible to write a function that really truly never returns anything, not even a made-up singleton value like (). In C style languages the simplest such function might contain an infinite loop like<pre name="code" class="c">while(true);</pre>For my purposes a better way to explore the idea is with infinite recursion. Pretend your favorite C style language does tail call optimization[1] so we can ignore stack overflow issues and examine the following<pre name="code" class="c">X foo(){ return foo(); }</pre></p><p>The question is what should the X be? And the answer is that if you plug that line of code into C, C++, C#, or Java then X can be just about any type you want[2].</p><p>If that doesn't give you pause then consider this: your type system appears to be lying to you. The function is promising an X but has no hope of delivering. And its quite happy to promise you any X. If you write "String foo(){ return foo(); }" then your type system considers the expression "foo();" to have type String so that you can write "String myFoo = foo();". But guess what, foo doesn't return a String and myFoo will never get a String. Is the type system lying?</p><p>The answer is no, of course not. It's just proving something a bit weaker than you might think at first. Instead of proving that "foo() will compute something that is a String" it's proving that "foo() won't compute something that isn't a String." If your "Law of the Excluded Middle" sense just tweaked, then you're on the right page. You'd think there are only two cases to consider: either the function definitely computes a String or it definitely doesn't. But there's this third case where it doesn't compute anything, and since the type checker can't reliably detect non-termination (c.f. Turing Halting Problem) it has to do something a bit odd. First, it assumes that the foo() declaration is correct and does return a String. Then it goes looking for a contradiction such as returning an int. Inside the function the compiler sees that the return is the result of the expression foo(). But the compiler has already assumed that foo() returns a String. There's no contradiction between declared and result so all is happy. The word "tautology" should come to mind right about now. By assuming X is true the type checker has proved that X is true. This weakening is a practical consequence of the Turing Halting Problem and there are at least two good ways to think about it. But first some more examples of the phenomenon.<h3>Exceptions and Exits and ...</h3>I very casually suggested that we ignore stack overflow issues in the infinite recursion above. But exceptions are an important part of this story because they are, in some interesting way, related to non-termination. Consider this function (or its equivalent in your favorite language)<pre name="code" class="java">X foo(){ throw new RuntimeException(); }</pre>Once again, X can be any type. And once again foo() does not in fact compute an X.</p><p>Clearly these two definitions of foo are different things. Non-termination means we're stuck whereas an exception means we might be able to recover and try something else, or at least notify that there's a problem and shutdown cleanly. None-the-less they both behave similarly. First, they hijack the normal flow of control in such a way that it never returns back to the caller. And second they are both examples of a kind of weakening in the type system since any type can be substituted for X. Formally they are both examples of functions that diverge (functions that return normally are said to converge).<h3>The Type</h3>Here's a third example of a diverging function in Java. Translate as necessary for your language (in Java System.exit(n) stops the program immediately and returns n to the calling process).<pre name="code" class="java">X foo(){ System.exit(1); return null; }</pre></p><p>Yet another case where foo() won't compute anything, it diverges. In fact, the return statement after the exit call is dead code. However, this is example is slightly different than the other examples because we had to create the dead code for the compiler to be happy[3] and, in fact for a different "X" like int the return value might have to be changed. That bit of dead code is closely related to the heart of this article.</p><p>Java can detect some kinds of dead code. If you write "X foo() { throw new RuntimeException(); return null; };" then Java recognizes that the return is unreachable and complains. In my previous example with System.exit(1) Java didn't recognize that the call to exit() will never return so it required a following return statement. Obviously "throw" is a keyword and can get special attention that a mere library function can't but it would be useful to be able to let Java know that, like throw, exit() diverges.</p><p>One of the best ways to tell a compiler how a function behaves is by using types and, in type theory, there's a type that expresses just what we need. The type is called bottom (often written &#8869;), and while there are different ways to look at the bottom type I'm going to go with a subtyping based view that should be accessible to C++, C#, and Java programmers.</p><p>If a language has subtyping and a function says it will return a type "X" then the function is allowed to return a "Y" instead as long as Y is a subtype of X. In my example of a function that just throws an exception the return type could be anything at all. So if we wanted System.exit(1) to indicate that it diverges the same way throw does, then its return type should be a subtype of all types. And indeed, that's exactly what bottom is.[4] bottom is a subtype of String, and int, and File, and List&lt;Foo&gt;, and every other type. Usefully, conventional type hierarchies are written with supertypes above subtypes, which makes a convenient mnemonic for "bottom" which needs to go below everything on such a hierarchy.</p><p>Now, if you're used to OO thinking then you expect a value with a certain subtype to in some sense be substitutable everywhere that a supertype is expected. But how can any one object behave like a String, an int, a File, etc? Remember that bottom indicates divergence: an expression with type bottom can never compute a value. So if exit()'s return type was bottom it would be totally safe to write "String foo() { return System.exit(1); }" while another bit of code could have "int bar() {return System.exit(1); }."<h3>Making it Useful, A Divergence to Scala Divergence</h3>Occasionally it might be useful to indicate that a function diverges. Examples are functions like System.exit(1), or functions that always throw an exception, perhaps after logging something or doing some useful calculation to create the exception. But interestingly out of all the statically typed languages that have any following outside of pure research only Scala has the bottom type, which it calls Nothing. The reason Scala has a bottom type is tied to its ability to express variance in type parameters.</p><p>For some reason a lot of programmers run screaming into the night when you say "covariance" or "contravariance." It's silly. I won't get into all the details of variance, but I will say that in Scala the declaration "class List[+T] {...}" means that List[Subtype] is a subtype of List[Supertype]. No screaming necessary. And List[+T] brings me to one extremely practical use of bottom - what type should an empty List have? </p><p>Well, an empty List can have type List[String] or List[Foo] or List[int]. T can be whatever. And what's a subtype of whatever for all values of whatever? You guessed it: bottom (Nothing). Indeed, Scala has one constant called Nil whose type is List[Nothing]: a subtype of List[String], List[int], and List[whatever]. It all ties up in a bow when you consider that a List[T] has a method called head which returns the first element of the list as type T. But an emtpy list has no first value, it must throw an exception. And sure enough, head in List[Nothing] has type Nothing.</p><p>C# 4.0 is supposed to be getting definition site variance similar to Scala's but using the clever mnemonic keywords "in" and "out". I haven't heard anything yet on whether it will also add a bottom type, but it would make a lot of sense.</p><p>Java has usage site variance using wildcards. You can say "List&lt;? extends Supertype&gt; x" to indicate that x can have a List&lt;Supertype&gt; or a List&lt;Subtype&gt;. The bottom type would be useful in Java, too, although not as compelling since wildcards are so verbose people rarely use them even when they would make sense. Plus, Java folk tend to mutate everything and List[Nothing] in part makes sense in Scala because Scala Lists are immutable.</p><p>C++ does not have any simple way to express this kind of variance so the bottom type is even less compelling in C++ than it is in Java.<h3>Back On Track</h3>Haskell and languages in the ML family don't have an explicit bottom type. Their type systems don't have subtyping so adding bottom as a subtype would confuse things. None-the-less, they do have a nice way to express bottom that can be teleported back to Java, C#, and C++ (but not C). Recall that bottom is a subtype of all types. Another way of saying that is that if a function returns type bottom then for all types A, the function returns something compatible with A so why not express that directly? In Haskell the "error" function takes a string and throws an exception. <pre name="code" class="haskell">Prelude&gt; :type error error :: [Char] -&gt; a </pre> In Haskell, a lower case identifier in type position is always a type parameter, and [Char] means "list of Char", aka String. So for all types a, if "error" doesn't diverge then it will take a String and return an a. That can pretty much be expressed directly in Java<pre name="code" class="java">public static &lt;A&gt; A error(String message) { throw new RuntimeException(message); }</pre></p><p>or C++ <pre name="code" class="c++">class message_exception : public std::exception { public: explicit message_exception(const std::string& message) : message_(message) {} virtual ~message_exception() throw() {};</pre></p><p> virtual const char * what() const throw() { return message_.c_str(); };</p><p> private: const std::string message_; }; template &lt;typename A&gt; A error(const std::string message) {throw message_exception(message); }</p><p>And for either language, usage would be something like<pre name="code" class="c++">int divide(int x, int y) { if (y == 0) { return error&lt;int&gt;("divide by zero"); // drop the "&lt;int&gt;" in Java } else { return x / y; } }</pre></p><p>Haskell also has a function called "undefined" that simply throws an exception with the message "undefined." It's useful when you want get started writing some code without fully specifying it.<pre name="code" class="haskell">Prelude&gt; :type undefined undefined :: a</pre></p><p>The function isn't as interesting as the type, which promises that for any type a "undefined" can compute an a or diverge. Since "undefined" can't possible just produce a value of any arbitrary type, it has no choice but to diverge. The same idea can be added to Java<pre name="code" class="java">public static &lt;A&gt; A undefined() {return error("undefined"); }</pre> or C++<pre name="code" class="c++">template &lt;typename A&gt; A undefined() { return error&lt;A&gt;("undefined"); }</pre></p><p>In either language it might be used as<pre name="code" class="c++">string IDontKnowHowToComputeThis(int input) { return undefined&lt;string&gt;(); // again, make appropriate changes for Java } </pre></p><p>Given the requirement to write the "return" keyword in C#, Java, and C++ I'm not sure how practical something like a generified error function really is as compared to having it return an exception and making the user write 'throw error("blah")', nor whether "undefined" is that much more useful than just throwing an UndefinedException, but this does illustrate the relationship between the bottom type and functions that, in Haskell terms, compute "forall a.a" or in C#/Java/C++ terms return the type parameter A without taking an A as an argument.</p><p>Also, as always care should be taken when transliterating from one language to another. Java would allow a function like "undefined" to return a null instead of diverging. C++ would allow it to return anything all and would only fail to compile if it were used in an incompatible way. That contrasts with languages like Haskell and ML in which the only way to implement "undefined :: a" is to make it diverge in some form or fashion[5].<h3>The Bottom Value</h3>I've spent some time talking about the bottom type as having no values. But it does have expressions like "undefined()" and that leads to a rather philosophical notion of how to think of bottom as having a value. Sorta. Skip this section if you don't like total gray hair philosophizing. If you're brave, then stick with me and imagine a subset of your favorite C derived language that does not allow side effects. No mutation, no IO, and no exceptions. In this strange world functions either just compute values or they diverge by not halting. In such a world the order of evaluation mostly isn't important as long as data dependencies are met - in f(a(), b()), you can compute a() before b() or b() before a(), whatever, they can't interfere with each other. Doesn't matter. The only time order of evaluation is forced on us is when there are data dependencies, so "a() + b()" must compute a() and b() (or b() and a()) before their sum can be computed. Nice. </p><p>Well, almost. Order of evaluation can matter for expressions that diverge. Let me give an example.<pre name="code" class="c">int infiniteLoop() {return infiniteLoop();} int f(int x) {return 42;} int result = f(infiniteLoop());</pre></p><p>Because f ignores its argument, if we call "f(infiniteLoop());" there are two possible outcomes. If "infiniteLoop()" is eagerly evaluated before f is called then the program will diverge. On the other hand, if the expression "infiniteLoop()" is lazily remembered as being potentially needed later then f can successfully return 42 and the diverging expression can be forgotten just like it never happened (because it didn't).</p><p>We've gone to the pain of eliminating side effects from our language so it's a little irritating to have to keep thinking about evaluation order just to deal with divergence, so we perform a little mental trick; a little white lie. Imagine that functions like infiniteLoop above don't get stuck, they compute a value called &#8869;, which is the only value of type bottom.</p><p>Now, since the bottom type is a subtype of all types and &#8869; is the bottom value, then it follows that every type must be extended with &#8869;. Boolean = {&#8869;, true, false}, int = {&#8869;, 0, 1, -1, 2, -2, ...}, and unit = {&#8869;, ()}. That means we need some rules for &#8869; and how it interacts with everything else. In the vast majority of languages including Java, C#, and C++ but also impure functional languages like F# and OCaml doing just about anything with &#8869; can only compute &#8869;. In other words, for all functions f, f(&#8869;) = &#8869;. If you write f(infiniteLoop()) in those languages then the result is &#8869;. This kind of rule is called "strictness".</p><p>In contrast, Haskell is often called a "lazy language," meaning that expressions aren't evaluated until they're needed. That's not quite technically correct. The Haskell spec just says that it is "non-strict." The spec doesn't care when expressions are evaluated so long as programs allow &#8869; to slide as far as possible. An expression like f(infiniteLoop) must evaluate to 42. Haskell basically forces an expression involving &#8869; to evaluate to &#8869; only when the argument must be used[6]. The distinction between "lazy" and "non-strict" is subtle, but by being "non-strict" rather than "lazy" a Haskell compiler can use eager evaluation anytime it can prove that doing so doesn't change behavior in the face of &#8869;. If a function always uses its first argument in a comparison, then Haskell is free to use eager evaluation on that argument. Since Haskell truly does forbid side effects(unlike our imagined neutered language above), the choice of evaluation strategy is up to the compiler and invisible except for performance consequences[7].</p><p>C++, Java, and C# have just a tiny bit of non-strictness. In these languages "true || &#8869;" is true and "false && &#8869;" is false. If these languages were totally strict then "true || &#8869;" would be &#8869;. Users of these languages call this behavior "short circuiting" and it's done for performance reasons rather than being a philosophical goal, but it's still a curious departure from their normal rules.</p><p>There you have it. The bottom value &#8869; is a clever mental hack to allow purely declarative functional code to be reasoned about without injecting sequencing into the logic. It allows people to talk about the difference between a purely declarative strict language and a purely declarative non-strict language without getting into details of evaluation order. But since we're talking about languages that aren't so purely declarative, we can take off our philosopher hats and return back to the world where side effects are unrestricted, bottom is a type with no values and divergence means that flow of control goes sideways.<h3>Tuples and Aesthetics</h3>Last time I talked about the unit type and how, if you interpret types as logical propositions, the unit type behaves as a "true" and tuple types act as logical and "&#8743;." I also talked about an algebraic interpretation where unit type acts like 1 and tuple types act like multiplication "&#215;". So the type (A,B) can also be written as A&#8743;B or A&#215;B. The type (A,unit) is isomorphic A, A&#215;1 = A, and A&#8743;True &lt;=&gt; A.</p><p>Bottom has similar interpretations as 0 or False. The type (A,bottom) is isomorphic to the bottom type because you can't compute any values of type (A,bottom). A&#215;0 = 0, and A&#8743;False &lt;=&gt; False. Nice how it all hangs together, eh?</p><p>Bottom behaves like False in another way. In logic if you assume that False is true you can prove anything. Similarly in type systems the bottom type allows the programmer to promise to do even the impossible. For instance, here's a Java function signature that promises to convert anything to anything.<pre name="code" class="java">public static &lt;A,B&gt; B convert(A arg) {...}</pre>If you ignore dynamic casting and null (they're both weird in different ways) there's no meaningful way to implement that function except by diverging. More on this in an upcoming episode.<h3>Conclusion</h3>I somehow don't think anybody will be running into the office saying "I just read an article on the bottom type, so now I know how to solve our biggest engineering challenge." But the bottom type is still interesting. It's a kind of hole in your static type system that follows inevitably from the Turing Halting Problem[8]. It says that a function can't promise to compute a string, it can only promise to not compute something that isn't a string. It might compute nothing at all. And that in turn leads to the conclusion that in a Turing complete language static types don't classify values (as much as we pretend they do) they classify expressions.<h4>Footnotes</h4><ol><li>gcc does do tail call optimization <a href="http://www.ddj.com/cpp/184401756">under some circumstances</a>.</li><li>Oddly, in C, C#, and Java X can't be "void" because its an error to return an expression of type void. C++ does allow void to make writing templates a bit easier.</li><li>Yes, I can make it a void function and not have a return. But again that would illustrate how exit() is different from a throw: throw doesn't require the return type to be void.</li><li>Note I'm saying "subtype" here and not "subclass." int, List&lt;String&gt;, and List&lt;File&gt; are 3 different types. In C++, Java, and C# int doesn't have a class. In Java and C# List&lt;String&gt; and List&lt;File&gt; both come from the same class. Yet bottom must be a subtype of all 3 so it can't be a subclass.</li><li>A plea to my readers: I'm too lazy to see if "forall a.a" is bottom in F# or C#, or if, like Java, null would be allowed. My bet is that null won't be allowed because only Java has the peculiar notion that type parameters must be bound to reference types.</li><li>Very loosely speaking. Please see the Haskell Report for all the gory details about when Haskell implementations are allowed to diverge.</li><li>Haskell has another clever hack where throwing an exception isn't an effect, but catching one is. Since order of evaluation is unspecified in Haskell, the same program with the same inputs could conceivably cause different exceptions. Exceptions are theoretically non-deterministic in Haskell.</li><li>A language that isn't Turing complete can prove termination and does not need a bottom type, but such a language isn't powerful enough to have a program that can interpret the language.</li></ol> <div style="clear: both;"></div> </p></div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-5059709925446668158Fri, 21 Aug 2009 14:56:53 GMTVoid vs Unithttp://james-iry.blogspot.com/2009/07/void-vs-unit.html<div class="post-body entry-content"> <p>I suspect most people who come to one of the statically typed functional languages like SML, OCaml, Haskell, F#, or Scala will have only seen static typing in one of the popular C derived languages like C++, Java, or C#. Some differences in type system become clear fast. One particular spot is the relationship and difference between void and unit.</p><p>So imagine your favorite C derived language. Consider your language's equivalent of the following function definition [1]</p><pre>void printHello(String name) { printf("hello %s", name); }</pre><p>What does the word "void" mean? To any C, C++, Java, C# programmer it means that printHello doesn't return anything. Simple.</p><p>But a functional programmer[2] sees it quite differently. This article will explain the different point of view and why it's useful.</p><h3>What's a Function?</h3><p>It turns out to be very hard to write a single definition of "function" that covers all the ways the word has been used. But to a functional programmer at the very least a function is something that maps inputs to outputs. In the impure functional languages that function may also have an effect like printing "hello." I'll stick with the impure definition in this article.</p><p>Now, when a functional programmer talks about mapping inputs to outputs he or she doesn't mean taking a string as an input and outputting it on the console. Instead he or she is talking about arguments and returns of a function. If something is to have any pretense at being a function, or at least a function that returns normally, then it must return something.</p><p>printHello clearly does return normally so a functional programmer insists that it returns something. But a functional programmer also must recognize that it doesn't return anything "meaningful."</p><p>Thus rather than the concept "void," a functional programmer thinks of unit. In some languages the unit type is written as (), but for clarity I'm going to stick with "unit." Now, unit is a type, conceptually not much different from say "int". But unit, as its name somewhat implies, has exactly one value conventionally represented as an empty pair of parentheses ().</p><p>In OO terms () is a "singleton". There's only one instance of it. It's also immutable. In a very real sense it carries no information. You can't write a program that behaves differently based on what a function of type unit returns - it always returns the same ().</p><h3>A Performance Diversion</h3><p>At this point a little guy in the back of your head might be asking "doesn't returning something meaningless just waste cycles?" Well, it's important to distinguish between abstraction and implementation. The unit value, (), is an abstraction that the implementation is free to optimize away. If you write in Scala</p><pre>def printHello : Unit = { println("hello") () }</pre><p>Then the compiler will generate the equivalent of the original void based code. If in Scala you write</p><pre>print(printHello)</pre><p>Then the compiler might generate something like</p><pre>printHello print( () )</pre><p>Because () is a singleton it doesn't matter at the machinecode/bytecode level that it's not "actually" returned. The compiler can figure it out with no performance hit at all.</p><h3>Making It Useful</h3><p>Okay, blah, blah, blah, functional think this, unit that. Whatever. What's the point? () is a value with no information, unit is a type with only that value...what's the practical use?</p><p>Well, it turns out that its useful for writing generic[3] code. Java and C++, somewhat famously, do not have first class functions. But they can be faked. Here's the Java</p><pre>interface Function&lt;Arg, Return&gt; { Return apply(Arg arg); } Function&lt;String, Integer&gt; stringLength = new Function&lt;String,Integer&gt; { Integer apply(String arg) { return arg.length(); } }</pre><p>In C++ you can do it essentially the same way with a template and operator() overloading[4] </p><pre>template &lt;typename Arg, typename Result&gt; struct Function { virtual Result operator()(Arg arg) = 0; }; struct StringLength : Function&lt;string, int&gt; { int operator()(string arg) { return string.length(); } } Function&lt;int, string&gt; *stringLength = new StringLength();</pre><p>C# has a function type and lambdas so it's very easy to write.</p><pre>Func&lt;string, int&gt; stringLength = s =&gt; s.Length;</pre><p>So, if stringLength is an object of type Function&lt;String, int&gt; / Function&lt;int, string&gt; * / Func&lt;string, int&gt; then I can write the following</p><pre>int result = stringLength.apply("hello"); // Java int result = (*stringLength)("hello"); // C++ int reulst = stringLength("hello"); // C#</pre><p>So here's the question: how would I convert the printHello function earlier into a Function&lt;String,X&gt;/Function&lt;string, X&gt; */Funct&lt;string, X&gt;. What type is X?</p><p>If you've been playing along so far you recognize that it "should be" void, but Java and C# don't allow it. It's invalid to write Function&lt;String, void&gt; in either language. </p><p>Let's pretend you could. After all, C++ will let you write a class PrintHello : Function&lt;std::string, void&gt;. But there's still a problem even in C++. Generic code might expect a result and none of these languages let you denote a value of type void. For instance (in Java) imagine thise simple bit of reusable code.</p><pre>public static &lt;A,B&gt; List&lt;B&gt; map(List&lt;A&gt; in, Function&lt;A,B&gt; f) { List&lt;B&gt; out = new ArrayList&lt;B&gt;(in.size()); for (A a : in) { out.add(f.apply(a)); } return out; }</pre><p>C++ has std::transform and C# has IEnumerable#Map, which are more or less the same idea.</p><p>Anyway, given something like the above calling map/transform/Map with a list of strings and the printHello object should result in a List&lt;void&gt; of the same length as the original list. But what's in the list? The answer, in a functional language, is a bunch of references to (). Tada, the meaningless is useful for writing generic code. </p><p>C++ will allow a Function&lt;std::string, void&gt; but any attempt to call something like map with it will fail to compile. C++ doesn't have a () value so our generic code has become slightly less generic.</p><p>The Work Arounds</p><p>Java, C++, and C# programmers do solve this problem all the time. One common option is to not use Function&lt;string, void&gt; but some other concept like "Action&lt;string&gt;" to mean a function that takes a string, performs a side effect, and returns nothing useful. Then you essentially duplicate "map" into something called perhaps "foreach" that expects an Action&lt;T&gt; and returns void. That probably makes sense in this case since a list of references to a meaningless value is perhaps silly, but it also means a great deal of code duplication if this kind of higher order programming is common. For instance, you can't write just one compose function that creates a function from two other functions, you also have to write a compose that takes an action and a function to create an action.</p><p>Another answer is to make the printHello function object have type Function&lt;String,Object&gt;/Function&lt;string,void*&gt;/Func&lt;string,object&gt; and return a null. That's pretty disgusting. </p><p>Perhaps the best option is to create your own Unit enumeration with only a single value. That does leave Unit nullable in Java, but railing against nullability in Java is another blog post. </p><p>enum Unit { unit }; // C++, Java, or C#</p><p>As for good old C, well, the type system doesn't do parametric polymorphism so generic code in C is problematic no matter what you do. Still, with function pointers it's conceivable you might find a use for a unit value. Typically a C programmer would use 0.</p><h3>A Quick Aside About Purity</h3><p>For the most part I've casually used effects everywhere. Most functional languages like SML, OCaml, F#, and Scala are impure and let the programmer get away with that. Some functional languages, e.g. Haskell without unsafe* extensions, do not. They must track effects using the type system. But even with this caveat, the unit type is still useful to, for instance, distinguish from a function that fetches a string from the keyboard which would have might have type IO String vs one that prints to the console which might have type String -&gt; IO Unit.</p><h3>Conclusion</h3><p>Void and unit are essentially the same concept. They're used to indicate that a function is called only for its side effects. The difference is that unit actually has a value. That means that unit functions can be used generically wherever a generic first class function is required. The C derived languages miss out by treating void functions as functions that don't return anything instead of as functions that don't return anything meaningful.</p><p>Next time I'll talk about functions that really, truly return nothing.</p><h4>Postscript on Aesthtics and Tuples</h4><p>Mostly I've talked about the practical value of the unit type in this article, but I do love it when convenient practice and beautiful theory line up. So here's an attack from a different direction.</p><p>Many functional languages have tuples. A tuple is an ordered fixed size collection of values of (possibly) different types[5]. Tuple types are usually written with parens. For instance, (Foo, Bar) is the type of pairs (2-tuples) with one Foo, one Bar. The same type is sometimes written as Foo &#215; Bar, which makes sense too. You can see the type as a set that consists of a kind of cross product of the Foo and Bar sets. For instance if Foo has only two possible values {f1, f2} and Bar has only two possible values {b1, b2} then the type Foo &#215; Bar has 4 possible values, {(f1, b1), (f1, b2), (f2, b1), and (f2, b2)}.</p><p>You can have tuples that with higher numbers of members like triples and 4 tuples and such.</p><p>What about a 1-tuple? Well, the type Foo can be seen as a 1-tuple. There's would be no useful difference between the type Foo and the type (Foo).</p><p>The question to ask is, given that we have 3-tuples, 2-tuples, and 1-tuples does it make sense to have a 0 tuple? The answer, dear reader, is unit. The unit value is a tuple with no members. That's why Haskell calls both the type and the value "()" - it's like the type (A,B) without the A,B. And just as an empty list and empty set aren't quite "nothing" neither is the empty tuple.</p><p>And here's another interesting thing, type theorists will also call the unit type "1" (not to be confused with the value 1). To see why, look at the type (Foo, unit) which can also be written as Foo &#215; 1. If Foo only has 2 possible values {f1, f2} then the entire set of possible values for the type Foo &#215; 1 is {(f1, ()), (f2, ())}. It's the same 2 values extended with a little bit of "meaningless" information that can be stripped off or added back in as desired. Formally speaking it's isomorphic to the type Foo. To abuse notation slightly, Foo &#215; 1 = Foo. So unit is not only the type that has only 1 value, it's the type that behaves like a 1 at the type level.</p><h4>Footnotes</h4><ol><li>In C of course it would be char *name, in C++ it might be char * or I might use the string and iostream libraries. Whatever. The differences aren't that relevant to this discussion.</li><li>To save myself a lot of time I'm using phrases like "functional language" to mean the statically typed functional languages that have been influenced to a large extent by ML. Some of what I'm talking about does translate into some dynamically typed languages as well, although they often "pun" things so that there isn't a unit value that's distinct from nil, null, false or some other singleton value.</li><li>I use the phrase "generic" here to mean what the functional community means by "parametrically polymorphic" or just "polymorphic." However, my target audience is more likely to associate polymorphism with OO dispatch, so "generic" it is.</li><li>In C++ you can also do it without the pure virtual base class and just use template structural typing based on operator(), but shortly it'll be easier for me to describe a problem if I use a base class to give the concept a name. I could also allocate on the stack but that slightly confuses my point. I'm going this route to keep things symmetrical among the languages. Besides, this isn't a tutorial on why C++ isn't Java. On a related note it's a shame C++0x won't have concepts.</li><li>This is quite different from a Python tuple which is really just a list.</li></ol> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-7037726222994626038Thu, 23 Jul 2009 16:02:43 GMT"Fixing" Groovy?http://james-iry.blogspot.com/2009/07/fixing-groovy.html<div class="post-body entry-content"> <p>In response to my <a href="http://james-iry.blogspot.com/2009/07/groovy-does-not-have-optional-static.html">last article</a> on Groovy's lack of static typing James Strachan, the original Groovy guy, <a href="http://twitter.com/jstrachan/status/2667006971">tweeted</a> "groovy should fix this really IMHO." Well, perhaps. But "fixing" it would substantially change Groovy's semantics. Most importantly it would prevent Groovy from executing some programs that work just fine right now.</p><h3>An Example</h3><p>Here's a simple example of a program that works now but wouldn't under a static typing regime. It's a simple variation on the program from the previous article.</p><pre>interface Foo {} interface Bar {} class Baz implements Foo, Bar {} Foo test(Bar x) {return x} test(new Baz()) println("Everything's Groovy")</pre><p>Compile it, run it, and feel groovy.</p><pre>~/test$ groovyc test.groovy ~/test$ groovy test Everything's Groovy</pre><p>If I transliterate into Scala</p><pre>trait Foo trait Bar class Baz extends Foo with Bar def test(x : Bar) : Foo = x test(new Baz) println("Everything's Scala?")</pre><p>And try to compile</p><pre>~/test$ scalac -Xscript test test.scala !!! discarding &lt;script preamble&gt; (fragment of test.scala):5: error: type mismatch; found : this.Bar required: this.Foo def test(x : Bar) : Foo = x ^ one error found</pre><p>Scala rejects a program that Groovy is happy with.</p><h3>Conclusion</h3><p>That's the nature of the beast. Static typing rejects some good programs that might otherwise work. And as my last article showed, dynamic typing allows some bad programs that can't possibly work. That's a pretty fundamental difference and there's no way around it.</p><p>If the Groovy community wants to make its type annotations work in a more statically typed manner that's ok. This article just shows that such a change wouldn't be 100% compatible with today's Groovy.</p> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-7140048831795003215Thu, 16 Jul 2009 09:41:37 GMTGroovy Does Not Have Optional Static Typinghttp://james-iry.blogspot.com/2009/07/groovy-does-not-have-optional-static.html<div class="post-body entry-content"> <p>Every now and then somebody writes that Groovy supports optional static typing. But it doesn't. They've confused the use of type annotations on variables with a static type system.</p><p>This article is emphatically not an entry in the long, sad, and mostly insipid static vs. dynamic debate. It is a clarification of a misunderstanding.</p><h3>Definitions</h3><p>The word "static" in "static type system" is related to phrases like "static analysis." Basically, without executing the code in questions, a static type system analyzes code to prove that certain kinds of misbehaviors won't ever happen. </p><p>The word "dynamic" in "dynamic type system" is related to the "dynamic execution" of a program. In contrast to a static type system, a "dynamic type system" automatically tests properties at when expressions are being (dynamically) evaluated.[1]</p><h3>The Proof</h3><p>With those definitions its easy to show that adding type annotations to a Groovy program does not make it statically typed. Create a file named test.groovy. In it define a couple of unrelated classes</p><pre>class Foo {} class Bar {}</pre><p>Now write a function that promises to take a Bar and return a Foo. Add a println for fun.</p><pre>Foo test(Bar x) { return x } println("Everything's groovy")</pre><p>Compile it and run it (explicitly compiling isn't necessary, it just reinforces my point)</p><pre>~/test$ groovyc test.groovy ~/test$ groovy test Everything's groovy</pre><p>There were no complaints at all. A static type system would have rejected the program.</p><p>Open up test.groovy and make a slight adjustment so that the function is actually called</p><pre>Foo test(Bar x) {return x} println("Everything's groovy") test(new Bar())</pre><p>Compile and run this</p><pre>~/test$ groovyc test.groovy ~/test$ groovy test Everything's groovy Caught: org.codehaus.groovy.runtime.typehandling.GroovyCastException: Cannot cast object 'Bar@861f24' with class 'Bar' to class 'Foo' at test.test(test.groovy:4) at test.run(test.groovy:6) at test.main(test.groovy)</pre><p>A runtime exception is finally thrown at the (dynamic) moment when the test function is executed. That's dynamic checking rather than any form of static proof.</p><h3>Actual Static Typing</h3><p>Now open a file named test.scala.</p><pre>class Foo class Bar def test(x : Bar) : Foo = x</pre><p>Compile as a script </p><pre>~/test$ scala -Xscript test test.scala (virtual file):1: error: type mismatch; found : this.Bar required: this.Foo object test { ^ one error found</pre><p>Ta da, static typing. With no attempt to execute my test function the type checker says "this will never work."</p><h3>The Misunderstanding</h3><p>There's an oft repeated phrase that "in a statically typed language variables are typed." There's also a common misunderstanding that static typing and type annotations are the same thing. I can show a simple counter example to both misconceptions with one language: Haskell.</p><p>Create a file named Test.hs. Here's a function called flatten. It flattens a list of lists one "level" to just be a list.[2]</p><pre>flatten = foldl (++) []</pre><p>No variables were harmed during the creation of that function. It's written in what's called "point free" style. Yet with neither variables nor type annotations I can show it's statically typed by defining a bogus main function</p><pre>main = print $ flatten [1,2,3,4,5,6]</pre><p>And trying to compile</p><pre>~/test$ ghc Test.hs -o test Test.hs:3:24: No instance for (Num [a]) arising from the literal `1' at Test.hs:3:24 Possible fix: add an instance declaration for (Num [a]) In the expression: 1 In the first argument of `flatten', namely `[1, 2, 3, 4, ....]' In the second argument of `($)', namely `flatten [1, 2, 3, 4, ....]'</pre><p>Thus you need neither variables nor annotations to have static typing. A small change fixes things right up</p><pre>main = print $ flatten [[1,2,3],[4,5,6]] ~/test$ ghc Test.hs -o test ~/test$ ./test [1,2,3,4,5,6]</pre><p>It's easy enough to see what type GHC has assigned the function. Just add a module declaration at the very top of the file</p><pre>module Main (flatten, main) where</pre><p>And fire up ghci</p><pre>~/test$ ghci Loading package base ... linking ... done. Prelude&gt; :load Test Ok, modules loaded: Main. Prelude Main&gt; :type flatten flatten :: [[a]] -&gt; [a]</pre><p>Exactly as described, it takes a list of lists and returns a list.</p><h3>Groovy Does Have Static Typing</h3><p>Having shown that Groovy does not have static typing let me show that it does have static typing. :-) At least, it has static typing for some kinds of things. Open up test.groovy again and add this</p><pre>class MyRunnable implements Runnable {}</pre><p>Compile that and you get an error</p><pre>~/test$ groovyc test.groovy org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed, test.groovy: 8: Can't have an abstract method in a non-abstract class. The class 'MyRunnable' must be declared abstract or the method 'void run()' must be implemented. @ line 8, column 1. class MyRunnable implements Runnable {} ^ 1 error</pre><p>The error happens without executing anything so it's a form of static analysis, and it's caused by failure to adhere to the Runnable type's requirements hence it's a static type error. But it's not an optional static type check - you can't turn it off.</p><p>Interesting, no?</p><h3>Conclusion</h3><p>Whatever you feel positively or negatively about Groovy, please don't perpetuate the myth that it has optional static types. It doesn't. Groovy's optional type annotations are used for dynamic testing not static proofs. It's just that with type annotations Groovy dynamically tests are for "dynamic type" tags attached to objects rather than its more normal mechanism of dynamically testing for the presence of certain methods. What little static typing Groovy does have is not optional. </p><h4>Footnotes</h4><p>[1] In a certain formal sense the phrase "dynamic type system" is an oxymoron and "static type system" is redundant. However, the phrases are commonly used and there doesn't seem to be a commonly recognized concise phrase to describe what a "dynamic type system" does. Pedants will argue for "untyped" but that doesn't give me a concept on which to hang the different ways that say Groovy, Ruby, and Scheme automatically deal with runtime properties.</p><p>[2] A more general function called join already exist for all monads in Control.Monad module. It can be defined as "join m = m &gt;&gt;= id"</p> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-2560089181381669872Wed, 15 Jul 2009 17:06:35 GMTErlang is Not Functionalhttp://james-iry.blogspot.com/2009/05/erlang-is-not-functional.html<div class="post-body entry-content"> <p>Robert Fischer claims <a href="http://enfranchisedmind.com/blog/posts/scala-not-functional/">Scala is not a functional language</a>. But if you go by his post then Erlang isn't either.</p><h3>Modules</h3><p>Fischer says<blockquote cite="http://enfranchisedmind.com/blog/posts/scala-not-functional/"> In OCaml, we define a function like this:<pre>let f x = x + 1;;</pre>In Scala, though, we define the function somewhat differently:<pre> object SomeHolderObject { val f(x:int):int = { x + 1 } }</pre></blockquote></p><p>Now some Erlang</p><pre> -module(SomeHolderModule). -export([f/1]). f(X) -> X + 1. </pre><p>Erlang requires a function to be defined in a module. Scala calls a module an "object."</p><h3>Currying and Partial Application</h3><p>Fischer says [1]</p><p><blockquote cite="http://enfranchisedmind.com/blog/posts/scala-not-functional/">For the shut-out, let's see in Scala makes functional programming easy. Let's start with my favorite functional language stunt: currying. OCaml, you're up to bat first. <pre> let x a b = a + b;; let y = x 1;; y 2;; (* 3 *) </pre> Not too shabby. Scala, show your stuff! <pre> def x(a:Int, b:Int) = a + b def y = x(1) /* ERK! FAIL! &lt;console&gt;:5: error: wrong number of arguments for method x: (Int,Int)Int def y = x(1) ^ */ </pre> </blockquote></p><p>And here it is in Erlang</p><pre> - module(test). - export([f/2]). f(X,Y) -&gt; X + Y. 1&gt; c(test). {ok,test} 2&gt; test:f(1). ** exception error: undefined function test:f/1 </pre><h3>Algebraic Data Types</h3><p><blockquote>What about Algebraic Data Types, common to functional languages - including my beloved OCaml variant types?</blockquote></p><p>Erlang doesn't have them. You can fake them with tuples and atoms and such, but they just aren't part of Erlang.</p><h3>Conclusion</h3><p>Clearly, by Fischer's definition, Erlang is not a functional language.</p><p>But that's ridiculous.</p><p>David MacIver has an excellent post suggesting that there's <a href="http://www.drmaciver.com/2009/05/a-problem-of-language/">a problem of language</a>. Fischer hasn't defined what he means by "functional language" other than "feelings" so we're left with something nebulous.</p><p>I think MacIver is right, but Fischer does seem to have an informal definition. His definition of "functional language" is "OCaml". His post compares Scala to OCaml and concludes that because Scala is not OCaml Scala must not be functional. </p><p>Yet if I take his specific points I can find other functional languages, like Erlang, that are not OCaml either and therefor not functional either.</p><p>In fact, if I look at Common Lisp, Scheme, Clojure and Arc I find that they violate most of his demands too. So the whole Lisp family isn't functional either.</p><p>Fischer has fallen prey to a common mistake. He's overgeneralized from one datapoint. It's as if a Scheme programmer concluded that OCaml isn't functional because it doesn't have hygienic macros or a Java programmer concluded that Ruby isn't object oriented because it doesn't have inner classes.</p><h4>Footnotes</h4><p>[1] Actually, you can write "def f(a:Int)(b:Int) = a + b" in Scala if you want a curried function or you can write "val y = f(1, _)" if you want to partially apply an uncurried function. So Fischer blows major points here.</p> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-5126972453166458305Fri, 15 May 2009 12:56:40 GMTA Brief, Incomplete, and Mostly Wrong History of Programming Languageshttp://james-iry.blogspot.com/2009/05/brief-incomplete-and-mostly-wrong.html<div class="post-body entry-content"> <p>1801 - Joseph Marie Jacquard uses punch cards to instruct a loom to weave "hello, world" into a tapestry. Redditers of the time are not impressed due to the lack of tail call recursion, concurrency, or proper capitalization.</p><p>1842 - Ada Lovelace writes the first program. She is hampered in her efforts by the minor inconvenience that she doesn't have any actual computers to run her code. Enterprise architects will later relearn her techniques in order to program in UML.</p><p>1936 - Alan Turing invents every programming language that will ever be but is shanghaied by British Intelligence to be 007 before he can patent them.</p><p>1936 - Alonzo Church also invents every language that will ever be but does it better. His lambda calculus is ignored because it is insufficiently C-like. This criticism occurs in spite of the fact that C has not yet been invented.</p><p>1940s - Various "computers" are "programmed" using direct wiring and switches. Engineers do this in order to avoid the tabs vs spaces debate.</p><p>1957 - John Backus and IBM create FORTRAN. There's nothing funny about IBM or FORTRAN. It is a syntax error to write FORTRAN while not wearing a blue tie.</p><p>1958 - John McCarthy and Paul Graham invent LISP. Due to high costs caused by a post-war depletion of the strategic parentheses reserve LISP never becomes popular[1]. In spite of its lack of popularity, LISP (now "Lisp" or sometimes "Arc") remains an influential language in "key algorithmic techniques such as recursion and condescension"[2].</p><p>1959 - After losing a bet with L. Ron Hubbard, Grace Hopper and several other sadists invent the Capitalization Of Boilerplate Oriented Language (COBOL) . Years later, in a misguided and sexist retaliation against Adm. Hopper's COBOL work, Ruby conferences frequently feature misogynistic material.</p><p>1964 - John Kemeny and Thomas Kurtz create BASIC, an unstructured programming language for non-computer scientists.</p><p>1965 - Kemeny and Kurtz go to 1964.</p><p>1970 - Guy Steele and Gerald Sussman create Scheme. Their work leads to a series of "Lambda the Ultimate" papers culminating in "Lambda the Ultimate Kitchen Utensil." This paper becomes the basis for a long running, but ultimately unsuccessful run of late night infomercials. Lambdas are relegated to relative obscurity until Java makes them popular by not having them.</p><p>1970 - Niklaus Wirth creates Pascal, a procedural language. Critics immediately denounce Pascal because it uses "x := x + y" syntax instead of the more familiar C-like "x = x + y". This criticism happens in spite of the fact that C has not yet been invented.</p><p>1972 - Dennis Ritchie invents a powerful gun that shoots both forward and backward simultaneously. Not satisfied with the number of deaths and permanent maimings from that invention he invents C and Unix.</p><p>1972 - Alain Colmerauer designs the logic language Prolog. His goal is to create a language with the intelligence of a two year old. He proves he has reached his goal by showing a Prolog session that says "No." to every query.</p><p>1973 - Robin Milner creates ML, a language based on the M&M; type theory. ML begets SML which has a formally specified semantics. When asked for a formal semantics of the formal semantics Milner's head explodes. Other well known languages in the ML family include OCaml, F#, and Visual Basic.</p><p>1980 - Alan Kay creates Smalltalk and invents the term "object oriented." When asked what that means he replies, "Smalltalk programs are just objects." When asked what objects are made of he replies, "objects." When asked again he says "look, it's all objects all the way down. Until you reach turtles."</p><p>1983 - Bjarne Stroustrup bolts everything he's ever heard of onto C to create C++. The resulting language is so complex that programs must be sent to the future to be compiled by the Skynet artificial intelligence. Build times suffer. Skynet's motives for performing the service remain unclear but spokespeople from the future say "there is nothing to be concerned about, baby," in an Austrian accented monotones. There is some speculation that Skynet is nothing more than a pretentious buffer overrun.</p><p>1986 - Brad Cox and Tom Love create Objective-C, announcing "this language has all the memory safety of C combined with all the blazing speed of Smalltalk." Modern historians suspect the two were dyslexic.</p><p>1987 - Larry Wall falls asleep and hits Larry Wall's forehead on the keyboard. Upon waking Larry Wall decides that the string of characters on Larry Wall's monitor isn't random but an example program in a programming language that God wants His prophet, Larry Wall, to design. Perl is born.</p><p>1990 - A committee formed by Simon Peyton-Jones, Paul Hudak, Philip Wadler, Ashton Kutcher, and People for the Ethical Treatment of Animals creates Haskell, a pure, non-strict, functional language. Haskell gets some resistance due to the complexity of using monads to control side effects. Wadler tries to appease critics by explaining that "a monad is a monoid in the category of endofunctors, what's the problem?"</p><p>1991 - Dutch programmer Guido van Rossum travels to Argentina for a mysterious operation. He returns with a large cranial scar, invents Python, is declared Dictator for Life by legions of followers, and announces to the world that "There Is Only One Way to Do It." Poland becomes nervous.</p><p>1995 - Yukihiro "Mad Matz" Matsumoto creates Ruby to avert some vaguely unspecified apocalypse that will leave Australia a desert run by mohawked warriors and Tina Turner. The language is later renamed Ruby on Rails by its real inventor, David Heinemeier Hansson. [The bit about Matsumoto inventing a language called Ruby never happened and better be removed in the next revision of this article - DHH].</p><p>1995 - Brendan Eich reads up on every mistake ever made in designing a programming language, invents a few more, and creates LiveScript. Later, in an effort to cash in on the popularity of Java the language is renamed JavaScript. Later still, in an effort to cash in on the popularity of skin diseases the language is renamed ECMAScript.</p><p>1996 - James Gosling invents Java. Java is a relatively verbose, garbage collected, class based, statically typed, single dispatch, object oriented language with single implementation inheritance and multiple interface inheritance. Sun loudly heralds Java's novelty.</p><p>2001 - Anders Hejlsberg invents C#. C# is a relatively verbose, garbage collected, class based, statically typed, single dispatch, object oriented language with single implementation inheritance and multiple interface inheritance. Microsoft loudly heralds C#'s novelty.</p><p>2003 - A drunken Martin Odersky sees a Reese's Peanut Butter Cup ad featuring somebody's peanut butter getting on somebody else's chocolate and has an idea. He creates Scala, a language that unifies constructs from both object oriented and functional languages. This pisses off both groups and each promptly declares jihad.</p><h4>Footnotes</h4><ol><li>Fortunately for computer science the supply of curly braces and angle brackets remains high.</li><li><a href="http://www.theregister.co.uk/2006/01/11/exception_handling/">Catch as catch can</a> - Verity Stob</li></ol><h4>Edits</h4><ul><li>5/8/09 added BASIC, 1964</li><li>5/8/09 Moved curly brace and angle bracket comment to footnotes</li><li>5/8/09 corrected several punctuation and typographical errors</li><li>5/8/09 removed bit about Odersky in hiding</li><li>5/8/09 added Objective-C, 1986</li><li>5/8/09 added Church and Turing</li></ul> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-4502078731021642654Fri, 08 May 2009 19:21:27 GMTErlang Style Actors Are All About Lockinghttp://james-iry.blogspot.com/2009/04/erlang-style-actors-are-all-about_16.html<div class="post-body entry-content"> <p>I managed to offend a few Erlangers by talking about actors as being shared stateful objects a couple of articles back. Now I'm going to continue the tradition by talking about the actor locking model. And I'm not just talking about Erlang, but any of the libraries that have sprung up for other languages to support actor style concurrency. Let me emphasize again that I like both Erlang and actors. But I dislike the number of times I read from enthusiasts that you "can't have deadlocks in Erlang because it doesn't have any locks." Of course you can because it does. This article will explain.</p><h3>Locks</h3><p>There are all manner of locks in the world: mutexes, monitors, synchronization barriers, read/write locks etc., etc., yada yada yada. All of them boil down to a spot in the program where if conditions aren't good-to-go the currently executing thread stops executing until some other thread (or threads) change the condition the lock is waiting on[1]. Erlang has a locking primitive, too. It stops the execution of the current thread and waits for the right conditions to move on. The primitive is called (drum roll please)...'receive'.</p><p>"But, but, 'receive' receives messages - it's not a lock." Sure, it receives messages. True. But it also brings the currently running thread to a screeching halt until it receives a message it wants to handle. "receive foo -&gt; doSomething()" waits until the atom "foo" is at the front of the current process's mailbox - any other message gets thrown to the back of a "this receive can't handle that message" queue for processing by some other receive. In other words "receive foo -&gt;" is a lock that waits for the condition "foo at head of mailbox queue" before it unlocks.</p><h3>Deadlocks</h3><p>A deadlock is when there's a cycle in the locking dependencies of 2 or more threads. Thread 1 is locked waiting for condition 1 after which it will set condition 2, Thread 2 is waiting for condition 2 after which it will set condition 3, etc until you get Thread n which is waiting for condition n after which it will set condition 1. Everybody's waiting on somebody else to go first so nobody can go.</p><p>Based on this, a deadlock in Erlang is easy to create. Just spawn two actors that are each waiting for a message from the other</p><pre>Foo = spawn(fun() -&gt; receive {From, foo} -&gt; From ! {self(), bar}, io:format("foo done~n") end end). Bar = spawn(fun() -&gt; receive {From, bar} -&gt; From ! {self(), foo}, io:format("bar done~n") end end).</pre><p>If a running system doesn't have any other actors that will send Foo or Bar the right message then they are deadlocked until some poor sleepy eyed admin comes along to break the deadlock with "Foo ! {Bar, foo}". Now, in this case the deadlock is not a big deal. The resources these actors consume are pretty tiny and they don't really do anything useful even when they unlock. But if these actors were meant to do something actually useful such a deadlock could seriously impair the functionality of a system.</p><p>This was a toy example, of course. It's easy to see the deadlock. In a "real world" large scale application it might not be so obvious where the deadlocks lurk. Actor's can end up in very complicated, dynamically changing relationships. Couple that with the possibility for race conditions and you'll see that deadlock is a real concern. Erlang style actors make concurrency much easier to deal with, but they don't remove the need for careful design and testing.</p><p>That's pretty much all I really wanted to say in this article, but I can't feel my work here is done until I've committed further atrocities against Erlang.</p><h3>Perverting Erlang</h3><p>A couple of posts ago I had a problem with race conditions on a shared stateful actor. Here's what it looked like.</p><pre>1&gt; c(sockmachine). {ok,sockmachine} 2&gt; Machine = sockmachine:start(). &lt;0.38.0&gt; 3&gt; sockmachine:test(Machine, 2). all test processes launched [2] BONUS COIN! [1] BONUS COIN! ok [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! etc.</pre><p>While there are many good solutions, one evil solution is to use a mutex. You see, one well known aspect of concurrent programming is that once you have one locking primitive, many others are pretty easy to build. To the perversion mobile! </p><pre>-module(mutex). -export([create/0, lock/1, unlock/1]). create() -&gt; spawn(fun() -&gt; unlocked() end). lock(Mutex) -&gt; Mutex ! {lock, self()}, receive ok -&gt; ok end. unlock(Lock) -&gt; Lock ! {unlock, self()}. unlocked() -&gt; receive {lock, LockedBy} -&gt; LockedBy ! ok, locked(LockedBy, 1) end. locked(LockedBy, Count) -&gt; receive {unlock, LockedBy} -&gt; if Count == 1 -&gt; unlocked(); true -&gt; locked(LockedBy, Count - 1) end; {lock, LockedBy} -&gt; LockedBy ! ok, locked(LockedBy, Count + 1) end.</pre><p>When a Mutex actor is the unlocked state then any actor can tell it to lock. Once in the locked state it won't unlock until the first actor tells it to. [2] The counting business is so that one process can lock a mutex several times and then must unlock the mutex the same number of times before it is considered unlocked.</p><p>And now a bit of test code. I've modified the original sockmachine test code so that before inserting coins and pushing buttons a testloop must acquire the lock on a mutex and once done must release it.</p><pre>-module(mutextest). -export([test/3]). test(Machine, Mutex, Processes) -&gt; if Processes &gt; 0 -&gt; spawn(fun() -&gt; testloop(Processes, Machine, Mutex, 10) end), test(Machine, Mutex, Processes - 1); true -&gt; io:format("all test processes launched~n") end. testloop(Process, Machine, Mutex, Count) -&gt; if Count &gt; 0 -&gt; mutex:lock(Mutex), testzerocoins(Process, Machine), mutex:unlock(Mutex), testloop(Process, Machine, Mutex, Count - 1); true -&gt; io:format("[~w] testing completed~n", [Process]) end. testzerocoins(Process, Machine) -&gt; case sockmachine:insertcoin(Machine) of {nothing} -&gt; testonecoin(Process, Machine); {coin} -&gt; io:format("[~w] BONUS COIN!~n", [Process]), testtwocoins(Process, Machine) end. testonecoin(Process, Machine) -&gt; case sockmachine:insertcoin(Machine) of {nothing} -&gt; testtwocoins(Process, Machine); {coin} -&gt; io:format("[~w] BONUS COIN!~n", [Process]), testtwocoins(Process, Machine) end. testtwocoins(Process, Machine) -&gt; case sockmachine:pushbutton(Machine) of {tubeofsocks} -&gt; io:format("[~w] Got my socks.~n", [Process]); {nothing} -&gt; io:format("[~w] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH~n", [Process]) end.</pre><p>And here's what it looks like when I run all this</p><pre>4&gt; c(mutex). {ok,mutex} 5&gt; c(mutextest). {ok,mutextest} 6&gt; Mutex = mutex:create(). &lt;0.53.0&gt; 7&gt; mutextest:test(Machine, Mutex, 2). all test processes launched ok [2] Got my socks. [1] Got my socks. [2] Got my socks. [1] Got my socks. [2] Got my socks. [1] Got my socks. [2] Got my socks. [1] Got my socks. [2] Got my socks. [1] Got my socks. [2] Got my socks. [1] Got my socks. [2] Got my socks. [1] Got my socks. [2] Got my socks. [1] Got my socks. [2] Got my socks. [1] Got my socks. [2] Got my socks. [2] testing completed [1] Got my socks. [1] testing completed</pre><p>Ta da! Fixed my problem!</p><h3>Kids, Don't Try This At Home</h3><p>There are better ways to solve this kind of problem with actors. Seriously. Why use a custom mutex when Erlang has a lock built in as described above? Just change the sock machine to not be so stateful, i.e. make it take one message "{From, coin, coin, pushbutton}" and do all the work necessary before returning the tube of socks. Particularly nasty about this manual mutex code is that it's hard to ensure that all code paths will unlock a mutex. If my test code had thrown an exception at the wrong point then the mutex would be left locked. So don't do this. Unless you're evil. Like me.</p><h3>Conclusions</h3><p>Erlang's a great language. Actors are a great model. But they are neither race nor deadlock free. That's because, fundamentally, they're all about shared mutable state and locking. If after all of these articles you still don't believe me or you think I'm using the words wrong then please send your comments to dev/null. </p><p>In my next article I'll try to pervert Erlang for good rather than for evil and maybe try to explain the frequent observation that "yeah, races and deadlocks can happen with Erlang, but it seems less common than with more traditional shared memory/manual locking mechanisms."</p><h4>Footnotes</h4><ol><li>Erlangers use the word "process" because of the memory isolation enforced between different actors. I'm using "thread" here to avoid confusion with OS processes. But even here there's confusion because I mean logical threads which may or may not be the same as hardware threads. I've seen a few libraries call this concept "sparks."</li> <li>Or until some third actor spoofs an unlock message from the locking actor.</li></ol> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-6116630211443016162Wed, 29 Apr 2009 10:21:56 GMTJava Has Type Inference and Refinement Types (But With Strange Restrictions)http://james-iry.blogspot.com/2009/04/java-has-type-inference-and-refinement.html<div class="post-body entry-content"> <p>When enthusiasts talk about Scala they frequently start with "it's a lot like Java but with all this other cool stuff like type inference and structural types and..." I guess it's okay to introduce Scala that way. I've done it myself. But it does give the impression that Scala has somehow heaped complexity onto the Java. Oddly enough, it hasn't, at least not in the areas of type inference and structural types. Java has both. Sorta. They're just not very useful.</p><p>In this short article I'm going to try to show that while type inference and structural types appear to be additions to Scala over and above Java, they can also be explained as the removal of peculiar restrictions from Java. The goal is not just to illustrate these two points, but to make a broad statement: Scala isn't a just bunch of features heaped on top of Java. To a significant degree it's a generalization of things already found in Java. The result is that many aspects of Scala are both simpler and more regular than Java.</p><h3>Type Inference</h3><p>Java folk get excited about Scala's type inference. Usually positively, but sometimes negatively. But here's the thing: every practical statically typed language has some type inference. Every one. Even Java has type inference. It's just very limited. Let me show you by starting with a simple class</p><pre name="code" class="java">class Foo { public Foo bar() { return this; } public String toString() { return "Foo!"; } }</pre><p>With that I can write </p><pre name="code" class="java">System.out.println(new Foo().bar().bar().bar());</pre><p>It's silly code but it illustrates my point. Java doesn't need me to explicitly annotate the return type of each call to bar(). Java infers the type resulting from new Foo().bar() as being a Foo, which in turn can be sent a bar() message resulting in a Foo, etc. If I change things just a little the type system complains</p><pre name="code" class="java">System.out.println(new Foo().bar().bar().baz()); // bzzt error!</pre><p>In Java I "only" have to annotate types when I want to assign to a variable or return from a method. Which, come to think of it, is an odd restriction. The compiler obviously knows the type, so why do I have to tell it yet again?</p><pre name="code" class="java">Foo f = new Foo().bar().bar().bar();</pre><p>Thus, rather than saying Scala adds type inference we can equally well say that it removes a silly restriction from Java's type inference[1]. </p><h3>Structural Refinement Types</h3><p>The fact that you don't have to annotate types when building expressions in Java is well known even if people don't usually call it type inference. What's not as well known is that Java has refinement types. So I'll have to explain, this time starting with Scala.</p><p>In Scala you can say method doSomething takes any object with a bar method, like so</p><pre name="code" class="scala">def doSomething(x : {def bar : Unit}) = ...</pre><p>That's called structural typing - the doSomething method puts a constraint on the structure of objects that may be passed to it. But actually, in Scala, that's just shorthand for the following refinement type</p><pre name="code" class="scala">def doSomething(x : AnyRef {def bar : Unit}) = ...</pre><p>That says that x must be a subtype of AnyRef and must have a bar method. The type of x is a "refinement" of AnyRef with an additional structural constraint. Scala's refinement types aren't limited to just refining AnyRef, but that's really not important to my point here. My point is that Java has refinement types, too.</p><p>If you're a Java programmer, you're probably scratching your head thinking I've lost my sanity. But I promise I haven't. I ask you, in Java what exactly is the type of this expression?</p><pre name="code" class="java">new Object() { public void bar() { System.out.println("bar!"); } }</pre><p>If you said "Object!" think again. In Java it's actually something more sophisticated. Go paste this into your favorite Java IDE and run it. I'll wait</p><pre name="code" class="java">public class Test { public static void main(String[] args) { (new Object() { public void bar() { System.out.println("bar!"); } }).bar(); } }</pre><p>Surprised? Remember my point about limited type inference above? Here, Java has clearly inferred that the result of the expression is an Object refined with a bar method[2]. And so it lets you send a bar() message to the object. Rename either the method or the call to "baz" and you get a type error. But if you want to do anything interesting with the (new Object {public void bar() {...}};) object, like assign it to a variable, return it from a method, or apply it as an argument to a method then Java has a problem. It has no way to denote the refinement. So the refinement gets "lost" from the static type system. Poof. Gone. It's part of the object, you can find it via reflection, but the static type system won't let you get at it. The static type system only lets you get at a refinement "immediately" in a surrounding expression. Which is pretty close to useless. When you look at it that way, Scala hasn't so much added refinement types as removed another silly Java restriction.</p><h3>Conclusion</h3><p>When Java people first start looking into Scala it's common to think of Scala as being some complicated mishmash. In large part, evangelists (including me) are to blame. It's surprisingly hard to convey a nice feature of language A to users of language B without making it sound like something totally new, hence something that will complicate their lives. Hopefully this article has helped illustrate that in a very real sense some of the things that appear to be "new" in Scala are really "old" but generalized to be made more useful.</p><p>So this article is a call to all Scala pundits out there: when comparing Scala to another language, especially Java, go ahead and get excited about a "new" feature...but then see if you can't explain it as a generalization of an "old" one. You might be surprised how often you can and your audience might come away less intimidated.<h4>Foot Notes</h4><ol><li>Technically, Scala's type inference is quite a bit more sophisticated than Java's. But I won't let inconvenient facts like that get in the way of making a good point.</li><li>Technically, I think the Java spec says that the type isn't structural, but some anonymous nominative type that the compiler can infer and use over a very limited scope. Then again, maybe not. I'm too lazy to go re-read the spec. Also, as already established, I won't let inconvenient facts like that get in the way of making a good point.</li></ol> <div style="clear: both;"></div> </p></div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-4829031039331534097Fri, 24 Apr 2009 18:31:24 GMTThe State of Sock Tubeshttp://james-iry.blogspot.com/2009/04/state-of-sock-tubes.html<div class="post-body entry-content"> <p>Oddly, given that state is such a central notion to so much programming, it's awfully misunderstood. How do I know? Because OO programmers think they can make it private and Erlang programmers think they don't share it. If you agree with either of those statements then this article is for you.</p><p>I want to start with a simple, hypothetical vending machines. It dispenses socks in a tube, which is cool, because "tube of socks" is a huge Internet meme. Well, not yet, but will be when this article makes the front of Reddit, Digg, Hacker News, and CosmoGirl.</p><p>So, anyway, the machine has a coin slot, a button, and a dispensing tray. If you put in two coins one after the other and press the button you get a tube of socks. If you insert two coins and then insert a third you get your coin back. Same with fourth, fifth, etc. Pressing the button after putting only one coin or no coins at all results in nothing happening. It's a simple machine, but it's enough to illustrate all my points. A diagram sums it up nicely.</p><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_Nyksh2OFfkU/SdaexYpMWgI/AAAAAAAAAGs/36LoPVL5kZY/s1600-h/path2383.png"><img style="cursor:pointer; cursor:hand;width: 400px; height: 153px;" src="http://2.bp.blogspot.com/_Nyksh2OFfkU/SdaexYpMWgI/AAAAAAAAAGs/36LoPVL5kZY/s400/path2383.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5320614581050825218" /></a><p>This is, you guessed it, a diagram of a state machine. In particular this one is a finite state machine, but that's not important to this article. So I'm wasting time by mentioning it. And more time by explaining that I'm wasting time. The diagram shows 3 states labeled 0 coins, 1 coin, and 2 coins. Between the states are arrows labeled with an action you would take to cause the machine to move from one state to another. Some of the transitions also have an action that the machine takes during that transition like dispensing foot warmers or returning a coin. The diagram assumes an infinite supply of socks. And tubes. Got it? Good. Patent pending. Onward.</p><h3>Hypothetical</h3><p>Imagine I locked you in the room with a bag of coins and that machine. The machine is a sealed, glossy black, and glows with a malevolent sock tube evil. Imagine there's no way for you to see what's inside, no way to know how the machine actually works. Assignment: figure out as much about its internals as you can.</p><p>Even with these restrictions, I'd bet that with only a small amount of experimenting you could easily draw something like the diagram above. I mean, if I unlocked the room long enough for you to get a pen and paper you could draw it.</p><p>True, you'd never be certain you knew all the details. It could be that after dispensing 1,000,000,000,000 pairs of socks it would reward you with a unicorn instead. You also wouldn't know if the machine was written in Ruby or Java. Er, integrated chips or discrete components. So many details could still be hidden. But you would certainly know that the machine had state and that the state changed based on actions you took.</p><h3>You Can't Make State Private</h3><p>So, point #1) you can't make state private. You can hide the details of how the state is represented and you can carefully control the valid state transitions and you can control who has access to change the state, but you can't make the state private. State is the fact that something reacts differently to the same inputs at different times. Sometimes putting a coin in and pushing the button does nothing but sometimes it results in a cool tube of socks. That's state. The gears and transistors are hidden, but the state is observable. If your favorite OO object has all private data but one single method does something different based on changing values in that hidden, private, super secret data then by definition state has been exposed to anything with access to that method or anything that has access to anything that has access to that method or anything that has access to anything that...etc.</p><p>Now, hypothetically the tube sock machine could have some internal counter that counts the total number of coins that has ever been inserted. But if that counter is strictly internal and can never be observed then it's not private state, it's just dead code ... er useless machinery. Make sense? Good. Patent pending. Onward.</p><h3>How You Can Make State Private</h3><p>Having just explained that you can't make state private, I will now explain how you can make state private. Imagine if our sock machine were sealed inside another machine which has a single button and a single dispenser tray. Further, imagine the outer machine has an infinite supply of coins in its guts. Pushing the button on the outer machine causes it to insert two coins into the inner machine, push the inner button, and then dispense the resulting tube of socks. If you were locked into the room with this machine you would never know that inside lives a Rube Goldberg state machine. You would just know that pushing the button gives you socks, every time, no ifs, ands, or buts. To an outside observer this machine is stateless. Only by ripping opens its guts would you know the awful truth.</p><p>The moral is that you can't make state private with a "private" keyword, but you can make it local. In programming world that means you can implement something stateless using "local" mutable variables. State is relative to an observer. If you can't observe it changing, then it's not state as far as you're concerned. That's why it's perfectly valid to say that the pure parts of Haskell programs are done without state even though thunks are memoized. If that was gibberish to you, I apologize, it was gibberish to me too.</p><h3>The End, Or Is It?</h3><p>This long ass article was really just a prequel to the real article coming soon which will explain why Erlang style actor "shared nothing" message passing concurrency is all about shared state. Shocked? Good. Patent pending. <div style="clear: both;"></div> </p></div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-8551027449583272617Fri, 17 Apr 2009 07:29:35 GMTErlang Style Actors Are All About Shared Statehttp://james-iry.blogspot.com/2009/04/erlang-style-actors-are-all-about.html<div class="post-body entry-content"> <p>Actors have become quite the popular topic. Besides Erlang, there's a famous library implementation in Scala and at least 3 for Java. But the "no shared state" propaganda is setting people up for failure. In <a href="http://james-iry.blogspot.com/2009/04/state-of-sock-tubes.html">last week's exciting episode</a> I defined both what state is and what it isn't. It is the fact that something responds differently to the same inputs over time. It is not the details of how that is represented. Based on that I can now show why saying that Erlang style actors don't share state is totally wrong headed - that, in fact, if you don't want to share state the actor model is a very clunky way of doing things.</p><p>Before I go on, let me emphasize that I like Erlang. I like actors, too. It's a very nifty model. But it's not a stateless model. Let me show what I'm ranting against:</p><ul><li><a href="http://www.javaworld.com/javaworld/jw-02-2009/jw-02-actor-concurrency1.html?page=1">"The actor model consists of a few key principles: No shared state"</a></li><li><a href="http://www.defmacro.org/ramblings/concurrency.html">"Erlang is an inherently stateless system."</a></li><li><a href="http://www.pragprog.com/articles/erlang">"Erlang does not have [...] 'mutable state'"</a></li></ul><p>The problem is that Erlang style actors can have shared, mutable (changeable), state. The model is all about shared state. If you didn't need to share mutable state then there are better alternatives than actors. And, importantly, even when you are using actors well you must think about all of the problems of shared mutable state. All of them. People who believe the propaganda are in for rude shocks.</p><h3>The Situation</h3><p>Last time I described a simple sock tube dispensing machine. Insert two coins, press the button, get a tube of socks. Insert too many coins and you get the extra coins returned. Push the button before you've inserted enough coins and you get nothing. Here's the diagram.</p><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_Nyksh2OFfkU/SdaexYpMWgI/AAAAAAAAAGs/36LoPVL5kZY/s1600-h/path2383.png"><img style="cursor:pointer; cursor:hand;width: 400px; height: 153px;" src="http://2.bp.blogspot.com/_Nyksh2OFfkU/SdaexYpMWgI/AAAAAAAAAGs/36LoPVL5kZY/s400/path2383.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5320614581050825218" /></a><p>Imagine that Alice and Bob work at Crap Corp. ("Making High Quality Crap Since 1913"). Once a day they each like to saunter down to the break room and purchase a nice warm tube of socks. But, this is Crap Corp. and the machines don't take regular coins but Crap Corp. Coins instead. Each CCC weighs 40 pounds (very roughly 18.1436948 kilograms).</p><p></p><p>Naturally, Alice and Bob don't want to carry 80 pounds of crappy tokens around with them so they each laboriously drag a token down to a machine, insert it, walk back to their cubicle, grab another and repeat. Now, if Alice and Bob take their sock breaks at very different times that's probably going to work fine. But if they tend to overlap bad things happen. It's possible for Alice to insert her first coin, Bob to insert his first coin, Alice to insert her second coin and get an coin back (BONUS! she cries happily, experiencing the greatest joy she's ever experienced at Crap Corp.) So Alice pushes the button, gets her tube of socks, and merrily skips back to her cube. Well, maybe not skip exactly, but whatever you do when you're ecstatically happy while carrying 40 pounds of crap.</p><p>Then Bob shows up, inserts his second coin, eagerly smashes the button to get his well deserved tube of socks and ... gets nothing. Feeling cheated he pounds on the machine, kicks it, shakes it, and rocks it back and forth. Inevitably, the machine tips over, falls on Bob, and crushes him with all the tons of Crap Corp. Coins that have been inserted over the weeks. A tragic ending for somebody who just wanted some socks.</p><p>Now, that outcome isn't guaranteed even if Bob and Alice start about the same time. On the way to inserting her first coin Alice could be waylaid by the boss looking for his TPS report. As Alice patiently explains that a) TPS reports were never her job and b) they were discontinued three years ago and c) her eyes are on her face not her chest, Bob could have merrily taken the two coin trips and safely received a tube of socks without ever knowing the mortal injury he narrowly avoided.</p><h3>Finally, Some Damn Code</h3><p>Any time something unwanted can happen as the result of unpredictable delays, scheduler priorities, workload, etc you have a race condition. What could be more unwanted than being crushed by a vending machine? And what could be more unpredictable than a pointy haired boss? We can write this up exactly in Erlang.</p><p>In a file called sockmachine.erl. First, a little standard module definition and export business.</p><pre> -module(sockmachine). -export([start/0, insertcoin/1, pushbutton/1, test/2]).</pre><p>Here are the guts of the machine. zerocoins(), onecoin(), and twocoins() are the states of the machine. When one is called it blocks, waiting for an message in its inbox. Based on the message it gets it responds with {nothing} if nothing happens, {coin} if it needs to return a coin, or {tubeofsocks} for the win. It also then calls the appropriate function for the next state - which might be the same state. These are all private functions not exported by the module. Note, there are more clever ways to write this - but for explanatory purposes I like this.</p><pre>zerocoins() -&gt; receive {coin, From} -&gt; From ! {nothing}, onecoin(); {button, From} -&gt; From ! {nothing}, zerocoins() end. onecoin() -&gt; receive {coin, From} -&gt; From ! {nothing}, twocoins(); {button, From} -&gt; From ! {nothing}, onecoin() end. twocoins() -&gt; receive {coin, From} -&gt; From ! {coin}, twocoins(); {button, From} -&gt; From ! {tubeofsocks}, zerocoins() end. </pre><p>Start spawns a new sock machine actor in the zerocoins state</p><pre>start() -&gt; spawn(fun() -&gt; zerocoins() end).</pre><p>insertcoin and pushbutton are rpc style convenience functions that insert a coin or push the button. Or did I get that backwards? Well, whichever, they each return whatever they recieve as a message back from the machine.</p><pre>insertcoin(Machine) -&gt; Machine ! {coin, self()}, receive X -&gt; X end. pushbutton(Machine) -&gt; Machine ! {button, self()}, receive X -&gt; X end.</pre><p>Test spawns as many concurrent test loops as requested to simultaneously pound one machine.</p><pre>test(Machine, Processes) -&gt; if Processes &gt; 0 -&gt; spawn(fun() -&gt; testloop(Machine, 100) end), test(Machine, Processes - 1); true -&gt; io:format("all test processes launched~n") end. </pre><p>Testloop repeatedly walks through the cycle of inserting 2 coins and pushing the button. It calls helper functions that mirror the state of the sock machine to show what it expects to happen at each step, complaining when things don't go well.</p> <pre>testloop(Process, Machine, Count) -&gt; if Count &gt; 0 -&gt; testzerocoins(Process, Machine,Count); true -&gt; io:format("[~w] testing completed~n", [Process]) end. testzerocoins(Process, Machine, Count) -&gt; case insertcoin(Machine) of {nothing} -&gt; testonecoin(Process, Machine,Count); {coin} -&gt; io:format("[~w] BONUS COIN!~n", [Process]), testtwocoins(Process, Machine,Count) end. testonecoin(Process, Machine, Count) -&gt; case insertcoin(Machine) of {nothing} -&gt; testtwocoins(Process, Machine,Count); {coin} -&gt; io:format("[~w] BONUS COIN!~n", [Process]), testtwocoins(Process, Machine,Count) end. testtwocoins(Process, Machine, Count) -&gt; case pushbutton(Machine) of {tubeofsocks} -&gt; io:format("[~w] Got my socks.~n", [Process]); {nothing} -&gt; io:format("[~w] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH~n", [Process]) end, testloop(Process, Machine, Count - 1). </pre><p>Now fire up erl, compile, start a machine, and test it with only 1 running test loop</p><pre>1&gt; c(sockmachine). {ok,sockmachine} 2&gt; Machine = sockmachine:start(). <0.38.0&gt; 3&gt; sockmachine:test(Machine,1). all test processes launched ok [1] Got my socks. [1] Got my socks. [1] Got my socks. [1] Got my socks. [1] Got my socks. [1] Got my socks. [1] Got my socks. [1] Got my socks. [1] Got my socks. [1] Got my socks. [1] testing completed </pre><p>Ah, sweet, sweet success! But now run another test with 2 concurrent test loops. 1 = Bob, 2 = Alice...or was that the other way around?.</p><pre> 4&gt; sockmachine:test(Machine,2). all test processes launched [2] BONUS COIN! [1] BONUS COIN! ok [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] BONUS COIN! [1] BONUS COIN! [2] Got my socks. [1] Blasted machine ate my money! Give it to me! Rattle, rattle, ARRRGGHGHHRHRHGH [2] testing completed [1] testing completed </pre><p>It's a litany of socks, bonus coins and crushed intestines. On my machine it's an oddly predictable litany, but in a real, distributed Erlang app it would be much more interesting and random litany as network delays would emulate pointy haired bosses even better than Erlang's scheduler.</p><h3>Some Last Notes</h3><p>With Erlang style programming, actors are the central unit of statefulness. Multiple actors can share access to one stateful actor. Hence shared state, race conditions, and ruptured spleens. Am I saying that Erlang or actors are bad? No, in fact I quite like them. What the Erlang model does very nicely is separate that which must be stateful because it is concurrent from that which is more pure computation. By making state so much more painful to write than "foo.x = foo.x + 1" the actor model encourages you to think about the consequences of sharing it. It also cleanly mirrors the mechanics of distributed computing and asynchronous IO. It's nice, but it's not stateless.</p><p>One last note. I started with "actors are all about shared state." Naturally one might ask "well, what about stateless actors - actors that don't change state or depend on state via IO?" Certainly those are viable uses of actors. But that's no longer concurrency, that's parallelism and IMHO futures, data flow variables, and Haskell's data parallelism are all cleaner ways to deal with parallelism. Someday soon I hope to write about them. In the meantime, the whole point of having the complexity of message passing instead of those simpler mechanisms is precisely to deal with the complexity of concurrent state.</p><p>One really last note. Sadly, simple straight up actors don't automatically compose very well. You can design a set of actors that interact correctly for one use case but that don't interact at all well when plugged into a different system. This is another aspect that actors share with traditional manual locks and shared mutable memory. To date the best known way to deal with composable state is transactions (optimistic, pessimistic, distributed, compensating, software, hardware, database, whatever). There are very nice transactional capabilities available for Erlang, but this is yet another area where the "no shared state" mantra can lead people to think that actors are the entire answer without needing anything else.</p><p>Try not to get crushed and good luck with your socks!</p><h3>Postscript</h3><p>It has been suggested in the comments that when people say that Erlang style actors don't share state they mean it doesn't share memory. First, I clearly defined state in the previous article as being different from its representation. But, just as importantly, I think that saying "we don't share memory" is a distinction without much relevance. It's mostly an implementor's point of view that doesn't reflect how a user must think about actors.</p><p>Here's some Erlang code for simple shared mutable variables. This isn't recommended Erlang usage, but it doesn't break the model in any way.</p><pre>-module(variable). -export([malloc/0, fetch/1, store/2]). malloc() -&gt; spawn(fun() -&gt; loop(0) end). loop(Value) -&gt; receive {fetch, From} -&gt; From ! Value, loop(Value); {store, NewValue} -&gt; loop(NewValue) end. fetch(Variable) -&gt; Variable ! {fetch, self()}, receive X -&gt; X end. store(Variable, Value) -&gt; Variable ! {store, Value}.</pre><p>And here's me using it.</p><p><pre>1&gt; c(variable). {ok,variable} 2&gt; X = variable:malloc(). &lt;0.38.0&gt; 3&gt; variable:store(X,42). {store,42} 4&gt; variable:fetch(X). 42 5&gt; variable:store(X, variable:fetch(X) + 1). {store,43} 6&gt; variable:fetch(X). 43</pre></p><p>And, since these variables are actors they are just as easy to share as my sock machine example.</p> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-3065882543275531333Fri, 17 Apr 2009 07:29:16 GMTBut, But...You Didn't Mutate Any Variableshttp://james-iry.blogspot.com/2009/04/but-butyou-didnt-mutate-any-variables.html<div class="post-body entry-content"> <p>In a private email, somebody asked of my <a href="http://james-iry.blogspot.com/2009/04/erlang-style-actors-are-all-about.html">previous article</a> "Okay, I can see that there must be state since you've created a state machine. But you aren't mutating any variables so where does the state come from?" It's a good question. If you feel you "got it" then skip this article. But if you're still hesitant or think I'm misusing the word "state" please read on.</p><p>First, let me remind that I contend that state is 1) something responding to the same inputs differently over time as far as another observer is concerned and 2) an abstraction and not the representation. None-the-less, state always ends up with some representation. For instance, files can represent state. And even users of programs can hold state (quick, I'm in a drunken state and I plan to write a long rant, who am I?).</p><p>I want to ignore all but memory based state and show that, ultimately, even the Erlang code must manipulate mutable memory and that message sending effective leaks a little bit of that detail. Now, on Reddit an offended Erlanger called me a Java fanboy. As any of my loyal readers know, it is true that I think Java is the most incredibly awesome language EVER and I can hardly stop writing glowing things about it. But for this article I'll struggle against my natural instincts and use Ruby, Haskell, assembly, and Erlang. </p><p>I'll use Ruby to illustrate some points about state and representation. First the tired old state diagram and then some Ruby</p><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_Nyksh2OFfkU/SdaexYpMWgI/AAAAAAAAAGs/36LoPVL5kZY/s1600-h/path2383.png"><img style="cursor:pointer; cursor:hand;width: 400px; height: 153px;" src="http://2.bp.blogspot.com/_Nyksh2OFfkU/SdaexYpMWgI/AAAAAAAAAGs/36LoPVL5kZY/s400/path2383.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5320614581050825218" /></a><pre>class SockMachineSimple def initialize @state = :zerocoins end def insertcoin case @state when :zerocoins @state = :onecoin :nothing when :onecoin @state = :twocoins :nothing when :twocoins :coin end end def pushbutton if @state == :twocoins @state = :zerocoins :tubeofsocks else :nothing end end end</pre><p>If you don't read Ruby then Ruby control flow structures generally result in the value of their last contained expression so explicit returns aren't needed. :foo is a symbol (kinda like a string). @state declares a field (which is private) named state. I called it state because it so clearly matches to the state space specified by the diagram. But here's another way to write SockMachine with a completely different representation</p><pre>class SockMachineComplex def initialize @count = 0 end def insertcoin if @count % 3 == 2 :coin else @count = @count + 1 :nothing end end def pushbutton if @count % 3 == 2 @count = @count + 1 :tubeofsocks else :nothing end end end </pre><p>Instances of this class keep track of the total number of coins or button pushes ever accepted and use the modulus operator to decide what to do about it. The representation is quite different from that of SockMachineSimple, but to anybody using this class the difference is mostly irrelevant - it still conforms to the same state diagram. Internally it has a very large number of states but externally it still has the same three. And here's one last stateful machine</p><pre>class SockMachineDelegated def initialize @machine = SockMachineComplex.new end def insertcoin @machine.insertcoin end def pushbutton @machine.pushbutton end end </pre><p>By looking at the source it would appear that this version never mutate any variables after being initialized, but of course it does by calling methods on SockMachineComplex. Hiding the manipulation of representation is not the same thing as making something stateless. And now one last example, one that is totally different because it is not stateful.</p><pre>class SockMachineStateless def pushbutton machine = SockMachineComplex.new machine.insertcoin machine.insertcoin machine.pushbutton end end</pre><p>Internally, pushbutton uses a stateful machine but externally it is not stateful. Short of using a debugger you can't observe the state changes in SockMachineStateless's internal machine. Actually, I guess you could monkey patch SockMachineComplex to do some kind of IO or callback to expose the workings so maybe I shouldn't have used Ruby to illustrate my OO points. Too late now. </p><h3>Hiding State</h3><p>Okay, but that's OO stuff. Erlang is a functional language and the code never appeared to have any mutable variables or nested mutable objects. So what gives? Well, to illustrate Erlang I'm going to use Haskell, another functional language. Haskell's great advantage to this exposition is that it's really, really easy to get at some nice clean assembly.</p><p>So here's a very silly Haskell function for determining if a positive Int is even or odd. Its implementation is silly, but it allows me to make a point.</p><pre>flurb :: Int -&gt; String flurb n = if n == 0 then "even" else if n == 1 then "odd" else flurb $ n - 2</pre><p>This is a pure function - it mutates nothing. Yet when compiled with ghc -S -O to get AT&T; syntax assembly, the function looks like this</p><pre>Main_zdwflurb_info: movl (%ebp),%eax cmpl $1,%eax jl .Lczq cmpl $1,%eax jne .Lczo movl $rxA_closure,%esi addl $4,%ebp andl $-4,%esi jmp *(%esi) .Lczo: addl $-2,%eax movl %eax,(%ebp) jmp Main_zdwflurb_info .Lczq: testl %eax,%eax jne .Lczo movl $rxy_closure,%esi addl $4,%ebp andl $-4,%esi jmp *(%esi)</pre><p>GHC has compiled the flurb function down to a loop and the immutable n variable is represented with the mutable eax register [1]. Some pseudo Haskell/assembly mix might help illustrate</p><pre>flurb n = movl n, eax -- copy n into the eax register Main_zdwflurb_info: if eax == 0 then "even" else if eax == 1 then "odd" else addl $-2, eax -- decrement eax by 2 jmp Main_zdwflurb_info -- jump to the top of the loop</pre><p>As you can see, the Haskell code does use mutable "variables" at runtime. This is a common optimization technique in functional languages for dealing with direct tail recursion. But all that machinery is hidden from you as a programmer so just like SockMachineStateless the end result is stateless unless you peel back the covers with a debugger.</p><h3>Finally, Some Damn Erlang</h3><p>All right, I've written Ruby and Haskell, generated some assembly, and then written in an impossible chimeric language. But my original question was about Erlang. Here's a simple Erlang counter actor</p><pre>-module(counter). -export([create/0, increment/1]). create() -&gt; spawn(fun() -&gt; loop(0) end). increment(I) -&gt; I ! self(), receive X -&gt; X end. loop(N) -&gt; receive From -&gt; From ! N, loop(N + 1) end.</pre><p>Again, no variables are mutated. But if I assume that Erlang does the same kind of direct tail call optimization that Haskell does, the pseudo Erlang/Assembly loop looks like this</p><pre>loop(N) -&gt; movl N, eax % copy n into the eax register .looptop: receive From -&gt; From ! eax, % send the current value in eax back to the original sender inc eax % increment eax by 1 jmp .looptop % jump back to the top of the loop end.</pre><p>It's still a loop like the Haskell one, but there's an important difference. Each message receive sends back the current value of the mutable eax register then mutates it. So this behaves a bit like SockMachineDelegated - the original code didn't appear directly manipulate any mutable variables, but there was mutation under the covers and unlike SockMachineStateless but like SockMachineDelegated this change of state is visible beyond the local confines. [2]</p><p>Now, there are other ways to deal with recursion. I don't know Erlang's implementation, but it doesn't matter. Something is being mutated somewhere and that change is being made visible by messages being sent back. Typically non tail calls mutate the stack pointer so that it points to new stack frames that hold the current value of arguments, or then again some arguments might stay in registers. Tail calls that aren't direct tail recursion might mutate the stack region pointed to by an unchanging stack pointer. Whatever, it always involves mutation, and it's always hidden from the programmer when using pure functions. But actor loops are not pure functions. The sends and receives are side effects that can allow a little tiny bit of the mutable machine representation to be mapped to state that an observer can witness.</p><h3>But Really Where Are The Mutable Variables?</h3><p>So all that's fine, but it doesn't really show where the state is in my original Erlang code. The code didn't have an N to stick in a mutable register or a mutable stack frame. Here's the core of it.</p><pre>zerocoins() -&gt; receive {coin, From} -&gt; From ! nothing, onecoin(); {button, From} -&gt; From ! nothing, zerocoins() end. onecoin() -&gt; receive {coin, From} -&gt; From ! nothing, twocoins(); {button, From} -&gt; From ! nothing, onecoin() end. twocoins() -&gt; receive {coin, From} -&gt; From ! coin, twocoins(); {button, From} -&gt; From ! tubeofsocks, zerocoins() end.</pre><p>For this final answer I'm afraid I have to do a bit of hand waving. The explanation is even more abstract than the mutable variable as register/stack local explanation. You see, no matter what, your code always mutates one register: the instruction pointer. Its job is to point to the next instruction to be executed. As the CPU executes instructions the IP moves around, typically just being bumped up a bit for everything but jumps which move it more substantially. </p><p>In purely functional code the IP moves around but these moves can't be observed by other parts of the program as they are happening. In other words, in purely functional code, the changing IP is invisible. It's a completely black box unless you use a low level debugger. It's very much like SockMachineStateless where the internals were mutable but invisible to the outside world.</p><p>But with message receives the IP can be induced to move around based on things communicated from outside the function. The IPs current instruction can then define in large part what a message received by a process will do. If the IP points at the receive in the "zerocoins()" function then that's one behavior and if it it points to the receive in the "twocoins()" function then that's another. The different behaviors can be observed by other actors via messages sent back to them. When an actor sends a SockMachine a coin or buttonpress message it may indirectly be causing the IP to be mutated. And when it gets back nothing, coin, or tubeofsocks it's really being told, in a very indirect way, something about the value of the IP.</p><h3>Conclusion</h3><p>State is not the same thing as representation. None-the-less, with an omniscient view of things you can always find the representation of state. It might be in bits stored on a drive, it might be in neurons in a user's head, or it might be in a computer's memory and registers. The representation might not be directly visible in the source you're looking at but hidden in low level machinery. It might just be the instruction pointer pointing at different parts of your code base. </p><p>If you equate state with representation you end up with the very un-useful property that everything is stateful since at some level of representation every bit of executing code on your computer involves stateful processes. This view robs you of the ability to think about high level languages, especially purely functional ones like Haskell, at an appropriate level of abstraction.[3]</p><p>Purely functional code ensures that that the state represented by registers and stack pointers and instruction pointers is kept local. But side effects like message sends and receives allow that low level machinery to be used as the representation of shared mutable state even if you don't see it in your code.</p><p>In the end, it's far more useful in most cases to think of state from the point of view of an observer than the point of view of its implementation. The SockMachine actor was stateful regardless of how that state was represented simply because other actors could observe it changing state. Digging down into how send and receive allow a modicum of access to the instruction pointer just isn't a useful model for thinking about stateful actors normally. So the short answer to the original question is "Who cares how the state was represented? It was shared mutable state."</p><h4>Foot notes</h4><ol><li> Actually, it appears to be moving n back and forth between memory pointed to by eap and eax, which seems oddly wasteful given that it never branches out of this function. It also does 3 tests when only 2 should be necessary.</li><li>Technically the Erlang code probably can't keep the current value of N in a register when calling receive since receive may cause a task switch to another process, so assume that receive copies registers out to memory before executing then swaps them back in when its done.</li><li>Some people equate state with data, but that's just as bad. Code is data so all code must be stateful - another useless conclusion.</li></ol> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-8333808293262753604Fri, 17 Apr 2009 07:28:40 GMTThe Door to User Interface Designhttp://james-iry.blogspot.com/2009/04/door-to-user-interface-design.html<div class="post-body entry-content"> <p>A common tenet in user interface design is to be consistent, especially to be consistent with de facto standards. It's certainly a good goal, but I think that kind of consistency can be pushed too far. I recently thought of some real world examples: humble, ordinary, every day doors.</p><p>Doors share a common functional requirement: be closed sometimes and be open others. That leaves a lot of room for design though, like what color should it be? Or, more importantly, which way should it open?</p><h3>The De Facto Standard</h3><p>Doors in offices, bedrooms, bathrooms, etc. tend to open inwards. Why? Because a bathroom door opening outwards could suddenly thrust a large solid object into a public walkway. Black eyes, broken noses, and other hilarity ensue.</p><h3>The Exception</h3><p>There's a major exception to this rule. Public places where there are likely to be crowds like stores tend to have doors that open outwards. Why? Because a dangerous fire is far more likely to occur inside than outside, and if one did occur then a rushing crowd could jam an inward opening door closed. Panic, trampling, and other hilarity ensue.</p><h3>The Exception to the Exception</h3><p>There's a major exception to the exception to the rule. An airliner is certainly a crowded public place where you would want to quickly get people out in an emergency. Yet the door opens inwards. Why? Because at altitude the airplane will be pressurized. The inside pressure will be significantly higher than the outside and must be contained. The doorway is a significant weakening in that containment, but a door that opens inwards works just like a drain plug. The pressure difference keeps it firmly sealed and in place. A door that opened outward would be very likely to, um, open outward. Ejected flight attendants, hypoxia, and other hilarity ensue.</p><h3>The Take Away</h3><p>De facto standards are good and often have good reason to exist. But they don't always apply. Exceptions and exceptions to exceptions sometimes make more sense. Following a standard is no substitute for doing actual design.</p> <div style="clear: both;"></div> </div>James Iry (noreply@blogger.com)tag:blogger.com,1999:blog-178174920347765771.post-5091392080568383923Wed, 01 Apr 2009 15:23:37 GMT