Recent Updates Toggle Comment Threads | Keyboard Shortcuts

  • Matthewfl 6:39 pm on 29 November, 2016 Permalink | Reply  

    Redmagic Meta-tracing JIT 

    https://github.com/matthewfl/redmagic

    This is a newly published repository from an experiment that I started during the summer to attempt to optimize Python’s runtime without impacting compatibility with C models and existing libraries.

    There currently exists frameworks such as PyPy and Truffle/Graal which will generate a JIT when you design your programming language inside of their framework and use their compiler to generate a custom binary which implements a JIT for a given language.  One problem with these frameworks is that they require that an existing language be reimplemented essentially from scratch using a subset of Python or Java respectively.  Additionally, once a programming language is reimplemented, any existing modules which interfaces with internal aspects of the interpreter (any python C module) will not be compatible and will have to be rewritten.

    Redmagic is similar to PyPy and Truffle/Graal in that it tries to be a framework for creating a JIT however it is different in that it tries to work with an existing C or C++ interpreter requiring only a few annotations inside the code to identify loops at the user level of the program. (cPython example)  It then generates traces of a user’s program by switching between letting the language’s interpreter run normally and on top of a virtual machine which records all instructions and branching directions.  Unlike other JITs it does not reorganize how memory is laid out and even goes as far as simulating pushing and popping of C level stack frames.  This means that at any point while running inside the virtual machine or inside the generated code, the system can resume the “normal” code simply by jumping to the corresponding instruction in the original interpreter.  This does come with the downside that it inherits all memory layout and pointer dereferencing performance issues that were present in the original interpreter.

    Results: After much work on Redmagic, I have observed very slight (<1%) performance improvements in limited cases when working with cPython.  The issues of memory layouts (explored further in this post) seem to contribute significantly to Python’s performance issues and those are not addressed at all by the Redmagic implementation.  Additionally, it is not clear that these would be easy to address from Redmagic itself given that it is looking at a program from the level of x86 assembly instructions vs higher level representations where data-structures can be easily recovered.  I believe that there might be cases, possibly in math routines, where memory layouts have already been optimized and possibly allowing for more clever uses of branching could prove beneficial to performance.

     
  • Matthewfl 6:46 am on 15 November, 2016 Permalink | Reply  

    Cost of Abstractions 

    This post is from a recent conversation that I was having about the cost of abstractions in various languages. In choosing a programming language for a project, it is important to choose a language which gives a good mix of “time to develop” and “performance.” The first tends to be obvious from experience, however, for many the performance of a programming language (and there implementations) is somewhat opaque. I am going to demonstrate the differences between Python, Java, and C++ as this gives a good sample of the differences between programming implementations. We can think of Python as representative of the general class of interpreted base programming languages such as Ruby and Perl and Javascript before 2005. Java represents the current state of the art in JIT technology and should generalize to general JVM targeted languages and even languages such as Javascript which have seen countless hours invested developing new implementations of the language. Finally, C++ is a “low level” language, however it features a number of unique features, such as templates and stack allocation, which allow for zero cost abstractions which is sparsely seen in other languages.

    What is abstraction

    Abstraction in this post means the ability of the programmer to wrap a series of methods and data into some object which we are going to use throughout our program. An example of abstraction might be a class LogNumber which abstracts away the fact that we are representing some numbers inside of a log space instead of a linear space. This is fairly standard practice in machine learning applications and might even be necessary in some cases where the magnitude of number is smaller then the range of a floating point number. If we have this abstraction then we can write an expression such as: LogNumber + LinearNumber and have the LinearNumber automatically converted into log space and thus we will get the correct behavior from our program. If we are programming without this abstraction and simply using primitive types, and we accidentally write float_in_log + float_in_linear we are going to get a number that is neither in log space or linear space and this is a bug. One of the most famous examples of this bug is with the Mars Climate Orbiter crashed due to a units mismatch between standard and metric system.

    For the remainder of this post, we consider a simplified “LogNumber” class, A which is using a single 32 bit signed integer and getter and setter methods. One could easily imagine there being multiple getter and setter methods for when one wants to set a linear number to a log space number etc.

    # Python
    class A:
      def __init__(self, a):
        self._value = a
      def get(self):
        return self._value
      def set(self, a):
        self._value = a
    
    // Java
    class A {
      private int value;
      public A(int a) { value = a; }
      public int get() { return value; }
      public void set(int a) { value = a; }
    }
    
    // C++
    class A {
    private:
      int32_t value;
    public:
      A(int32_t a) : value(a) {}
      int32_t get() { return value; }
      void set(int32_t a) { value = a; }
    }
    

    Memory overhead

    First, lets consider the impact on memory that this abstraction has in each of these languages. Memory makes since as a place to start it is generally understood that the less memory that we use the more that we can fit into ram. Additionally, on modern hardware, accessing main memory tend to be a major bottle neck for performance with a read from main memory taking ~100 cycles while a local processor caches takes 10-20 cycles. As such, having an object which is twice as large means that we can fit half as many objects inside our processor cache and thus will end up performing costly accesses to main memory more often. Remember, while reading this next section, we are just wrapping a single 4 byte integer.

    # Python
    a_python = A(5)
    
    // Java
    A a_java = new A(5);
    
    // C++
    A *a_ptr_cpp = new A(5);
    
    A a_stack_cpp(5);
    

    Python: First for Python, in constructing the A class, we are going first to create a PyInstanceObject which contains 3 pointers plus the PyObject_HEAD which is 3 pointers plus a size_t bringing the size of this struct up to 56 bytes plus malloc overhead. (Note: the malloc overhead depends on the implementation of malloc used and could even be changed at runtime using something like LD_PRELOAD). Next, to actually save _value we have to construct the hash map that is going to back all the elements for this class instance. This means creating a PyDictObject which is a PyObject_HEAD, 3 size_t, 2 pointers, and 8 preallocated hash map slots for which each slot contains a size_t and 2 pointers bringing up the size of this object to 264 + malloc overhead. The number 5 in this case is a small integer and thus is already interned inside the python interpreter and the hash map key _value would be shared between all instances of this object and so we are not going to count it. Finally, we are going to have to maintain a pointer on Python’s stack to the object a_python that we just constructed. All together then this brings our total size to 328 + 2 * malloc overhead. Note: We could reduce the object size somewhat by using Python’s __slots__ feature which should avoid allocating an oversized hash map to back this object, however our object size is still going to be in the 100s of byte range.

    Java: Next with Java, there is an 8 byte header that is added to all objects which contains information about the type of object and object state w.r.t. locking etc.  (more info) Additionally, Java will round the size of objects up to the next 8 byte increment which means that there will be 4 wasted bytes inside this particular object. Finally, we need to store a pointer to this object on our stack. One of the nice tricks in Java for saving memory is pointer compression where a pointer will be 4 bytes instead of 8 which is a significant savings given that Java programs tend to use a lot of pointers.  (pointer compression info). Together, this means that there will be 20 bytes in memory corresponding to this object.

    C++: In the first case with a_ptr_cpp we can see that the object size will be 4 bytes as one would expect given the C++ standard. There will also be the additional malloc overhead associated with this object. This brings the total size for the a_ptr example to 12 + malloc overhead when including the stack pointer.
    In the second case with a_stack_cpp we have this being directly allocated on the stack which means that we have no pointer or malloc overhead or pointer to the object, thus the total size is only 4 bytes. The fact that we able to take a primitive type (int32_t) and wrap it inside some class and still consume the exact same amount of memory is what it means to have a zero cost abstraction.

    Registers

    Once C++ is compiled with some optimizations or a Java program has been running for long enough, it is feasible that both a_ptr_cpp and a_java are entirely stored inside a register saving 8 and 4 bytes respectively. Moreover, in the a_stack_cpp case we might have the integer value stored in a register, which means that the memory used in this case is 0 bytes.

    Escape Analysis

    For the C++ and Java case if a compiler can prove during optimizations that a dynamically allocated structure is not going to leave a method, then is is possible for that compiler to transform that object into a stack allocated object. This means that even for a_java and a_ptr_cpp could end up having zero memory consumption in the right circumstances.

    // Java
    {
      A a_java = new A(5);
      a_java.set(6);
      System.out.print(a_java.get());
      // we never let anything outside this block get the reference to a_java
    }
    
    // C++
    {
      A *a_ptr_cpp = new A(5);
      a_ptr_cpp->set(6);
      cout << a_ptr_cpp->get();
      delete a_ptr_cpp;
      // the compiler is able to see where this object is constructed and destroyed
      // if we didn't delete it, then the fact the leak could still be considered an escape in C++
    }
    

    Array overhead

    Arrays are basic primitive in nearly all languages when it comes to efficiently storing a number of similar objects. In the case of numerical processing applications, our main performance overheads are going to come down to how efficiently we store numbers and are able to access these values.

    # Python
    arr_python = [A(i) for i in range(10)]
    
    // Java
    A[] arr_java = new A[10];
    for(int i = 0; i < arr_java.length; i++)
      arr_java[i] = new A(i);
    
    // C++
    A *arr_ptr_cpp[] = new A*[10];
    for(int i = 0; i < 10; i++)
      arr_ptr_cpp[i] = new A(i);
    
    A arr_obj_cpp[] = new A[10];
    for(int i = 0; i < 10; i++)
      arr_obj_cpp[i] = A(i);
    
    std::vector<A> vec_obj_cpp(10);
    for(int i = 0; i < vec_obj_cpp.size(); i++)
      vec_obj_cpp[i] = A(i);
    
    A arr_stack_cpp[10];
    for(int i = 0; < 10; i++)
      arr_stack_cpp[10] = A(i);
    

    Python: First for python, when constructing a list we create a PyListObject which contains the PyObject_HEAD, a pointer to the head of the list and a size_t for the size of the list for 48 bytes. The list itself is simply a list of pointers to objects. Therefore the list object on its own will be 128 + 2 * malloc overhead. For every object, we are then going to have the same as we had above with 348 per object. This brings the total of this to 3608 bytes + 22 * malloc overhead.

    Java: With Java, there are essentially two types of arrays. The first is an array of a primitive type (such as int[]) where the value of the integers will actually be stored inside the array itself. These arrays can be considered very efficient since there is a small overhead when storing a large number of primitive values. However, in this case, our array does not contain primitive types and instead must has to be an array of pointers. This means that the array itself will have the 12 byte overhead for each object created and then 10 pointers for which each is 4 bytes, making the list itself 48 bytes which is divisible by 8 so we do not have any additional wasted space here. Each object in the list will then be 16 bytes as before and we have a reference on the stack bringing the total for this up to 212 bytes.

    C++: Finally with C++, we can see that there exist a number of possible implementations for an array of objects for which each of these have slightly different memory implications. In the arr_ptr_cpp case we are creating an array of pointers which is conceptually equivalent to the Python and Java implementations. Including the stack reference to this, it takes 120 + 11 * malloc overheads bytes in memory. Note: we are are going to have to explicitly deallocate all objects inside the array when freeing arr_obj_cpp which requires additional cognitive overhead when writing this program. In the second C++ example arr_obj_cpp we are allocating space inside the array for our object instead of using a pointer to reference the array. This means that the size of the array will only be 40 bytes bringing this implementations memory consumption to 48 + 1 * malloc overhead when including the stack pointer. The third case with vec_obj_cpp uses the C++ standard library and would be a more acceptable way of writing the arr_obj_cpp example and will use 2 additional size_t (16 bytes) to track the array’s size and amount allocated (64 + 1 * malloc overhead bytes). The final case is constructing an array directly on C++’s stack which means that we are avoiding the stack pointer reference as well as the malloc overhead and this will only require 40 bytes.
    Again, we can see that C++ is capable of a zero overhead abstraction as in the last 3 examples the only overhead from directly using int32_t was O(1) book keeping.

    Performance overhead

    While the way in which data is stored is important, we also care about how these languages are going to perform when we go to execute our program.

    # Python
    a_python = A(5)
    a_python.set(6)
    print(a_python.get())
    
    // Java
    A a_java = new A(5);
    a_java.set(5);
    System.out.print(a_java.get());
    
    // C++
    A *a_ptr_cpp = new A(5);
    a_ptr_cpp->set(6);
    cout << a_ptr_cpp->get();
    
    A a_stack_cpp(5);
    a_stack_cpp.set(6);
    cout << a_stack_cpp.get();
    

    Python: First, Python is the simplest language in the comparison, given that it is a simple bytecode interpreter. This means that it will not be performing any static compile time optimizations or attempting to compile away this abstraction.
    Looking at just the a.set(6) line, we can easily look at the Python’s bytecode using the dis library:

    # Calling a.set
    3 12 LOAD_FAST 0 (a)
    15 LOAD_ATTR 1 (set)
    18 LOAD_CONST 2 (6)
    21 CALL_FUNCTION 1
    24 POP_TOP
    
    # Method a.set implementation
    7 0 LOAD_FAST 1 (a)
    3 LOAD_FAST 0 (self)
    6 STORE_ATTR 0 (_value)
    9 LOAD_CONST 0 (None)
    12 RETURN_VALUE
    

    We can easily see that there are 10 python bytecode instructions. Some of these instructions such as STORE_ATTR and LOAD_ATTR represent hash table lookups to identify the _value and set slots. Additionally, nearly all of these instructions represent writes into memory given that Python uses reference counting, and by loading objects onto Python’s stack requires incrementing the reference count. To see full implementations of these byte codes you can look at ceval.c. From prior experience with python’s internals I would guess that this line represents about 2000-7000 CPU instructions with over 30 writes to memory and 100 reads.

    Java: With Java, it is a bit more difficult to exactly measure how much overhead there is when evaluating this instruction. This is because the JIT will be constantly generating new versions of this code which are more efficient. If we were to evaluate this instruction exactly once, then our bytecode will be running on an interpreter which is conceptually similar to python, but by not having reference counting and not backing every object with a hash map means that this is going to be significantly more efficient and about 100-500 instructions. After we have executed this statement a few times, the JIT will generate more optimized code. The exact code that is generated depends on a number of factors including if at the call site for a.set(6) if there exist multiple implementations of set that we must dispatch to (multiple classes implementing the same interface). Assuming that there will only be a single implementation of set, then the JVM will end up inlining and optimizing this down to 1 instruction which write to memory.

    C++: When using a pointer in C++, we can see statically that there is only a single implementation of A::set and so the compiler will likely inline this method call. This means that similar to the Java case it will have 1 instruction to perform the write into memory. In the second case with a_stack_cpp we might represent this instruction as a single write into a register. While both of these cases are going to be a single instructions, the later of writing into a register instead of memory will be much more “efficient.” Additionally, in a larger context, we could imagine that the compiler completely removes the representation of a_stack_cpp and just inlines the value 6 or 5 whatever is appropriate.

    Again, a_stack_cpp is giving a zero cost abstraction while a_ptr_cpp and a_java are giving us a low cost abstraction w.r.t. the number of instructions evaluated.

    Meta template programming

    The advantages of zero cost overheads really shine when combined with templates and meta programming techniques. By having templated classes, we are able to construct a single class which implements many basic features that we are able to reuse.  For example, suppose that we had a LogNumber class such that we could have LogNumber<LinearNumber<float> > and LogNumber<LinearNumber<double> > which use the same same which class implementation with two different sizes of numbers. We could even extend this to have LogNumber<LogNumber<LinearNumber<float> > > which would allow for us to easily represent a number in a log log space without having to reimplemented our LogNumber class.

    If we implemented this templated class in Java, then we would have that every template parameter would require a pointer and another class instances. This means that in the LogNumber<LogNumber<LinearNumber<Float> > > we would require that there are 4 class instances and thus would require 4 pointer indirection to actually reach the floating point value and consume 64 bytes of memory.

    In C++, this templated class does not use a pointer indirection when using a template. This means that the size of LogNumber<LogNumber<LinearNumber<float> > > and LinearNumber<float> will be exactly the same size. Additionally, accessing the stored number will only take a single memory access instruction and can be placed on the stack or used inside an array as shown above.

    Conclusion

    Abstraction is a very important concept when writing larger programs. Having zero cost abstraction available means that we can have both performant programs while reducing the mental burden on the on the programmer. When programming in a language such as Java or Python, we have to make a choice between developing the most performant program in that language and having a clean and reusable abstraction.

    We see the power of zero cost abstractions aggressively used inside libraries such as Eigen, Dlib and Tensorflow as these libraries care immensely about the computational and storage efficiency. Using C++’s template system, these libraries contain a single implementation for a generic routine which can be customized by switching out a few minor methods without losing any performance.

     
  • Matthewfl 9:45 am on 9 November, 2016 Permalink | Reply  

    WTF happened this year 

    This is a post that I started writing shortly after the Democratic convention, as I started thinking that Trump was going to end up beating Clinton.    Now that the “impossible” has happened (haha all those “old media” predicting <2% for Trump) I find myself publishing this post as an attempted reflection on how we got here.

    So what went wrong and what can be learned from this year? I think that the first and easiest way to frame this year might be “old establishment” vs “random unknowns” where a large number of individuals decided that unknowns would be better than more of the same for themselves. These individuals IMO tend to view the status quo as getting themselves screwed over by some external force, such as technological progress, trade deals, banks or immigrants, and as such wanted some candidate that would end what ever entity was screwing them over. When this battle comes down to between HRC and the Orange, we have one candidate that was essentially saying that things are not so bad and that what happened with the financial meltdown and bailing out the banks had to happen, and the other simply channeling people’s anger towards an undeserving group of people (immigrants). In the end, speaking to one’s frustrations rather than trying to tell them they are “crazy” and they are better off then X years ago was the better strategy (who would have guessed).

    While it is easy to forget where we came from in terms of the primaries, considering that I starting this posts months before it was posted, it is easy to recall the events of a few weeks ago.

    First, when the Tangerine said that he might consider a third party run if he was not treated fairly by the Republicans, this was an extremely smart home in hindsight. This was basically his escape hatch from the party which would have allowed the Tangerine to prevent a successful presidential bid if it came to light that there was any foul play during the primary processes. As such, the Republican party was forced in to playing a fair game and thus as an outsider Tangerine was given a fair shot.
    The flip side of this issue was Bernie, who said that he would not consider a third party run which basically meant that he gave license to the Democratic party to sabotage the primary processes against him without their being any consequences (such as directly losing in November as a result). This is something that we know happened given the wide array of emails that have been leaked. (1, 2)  (In the last few weeks alone there have been countless wikileaks which have shown the additional details of how the

    Simply looking at the Democratic primary, we had HRC with Clinton being one of the few name brands bigger than Lewinsky (sorry, bad joke… someone had to make it), and Bernie, a politician who many (at least on the west coast) have never heard about before. The massive HRC name brand politician then proceeded to lose 22 primaries. Additionally, winning the primaries that she did required that she conspired with major media providers and spend years specifically maneuvering to control the Democratic party through DWS becoming the party head (her campaign manager in 2008) and getting countless super delegates to pre-signup with her campaign. When in it came to fund raising, Bernie was consistently out raising HRC and he was doing it using a larger pool of donors making smaller contributions which IMO indicates a campaign which was better in touch with the actual voters. We see a similar parallel when comparing the sizes of HRC and Bernie rallies.

    In directly comparing Clinton to Tangerine and Bernie, we have HRC who continued to “evolve” her position to try and always attract the most voters while Bernie and Trump both took a position and keep pushing a core message.  The fact that Tangerine keep changing exactly what he said on specific policy issues didn’t change the core message which was simple enough that it easily resonated with his core voting base.

    False unity at the DNC
    Primary election fraud
    More info on voter fraud/suppression for Bernie voters
    Large report on election fraud

    TL;DR: If you fix your primary such that you ignore the “public poll” that you are conducing, the candidate that you get out is going to be weaker then they should be in the general election.

     

     

    Some additional comments:

    • While the mainstream media is likely going to frame this as “America wasn’t ready for a woman president,” I don’t think that was the issue.  Instead, Clinton was a weak candidate which to many Americans symbolized the failures of government that they can’t stand
      • The fact that “the most qualified Woman/person ever” just lost to the biggest “joke” we are unlikely to see another “Woman from a major political party” within the next 20 years.  The only chance that the next woman has of getting a nomination is that she wins on a populace surge, the party insiders of the Republicans and Democrats are going to be unwilling to risk it
    • At some level the country has just “approved” the Tangerine’s personal views on race and women (given that this election wasn’t won on policy)
    • Once Clinton pivoted to the general, talks of policy basically stopped.  Instead she started using personal attacks (forcing all those meaningless leaks about personal qualities).  Instead of Bernie had been in the general, he would have keep the message focused on policy, people would have been able to more easily recognize that Tangerine had no real policy and would be unable to deliver.
      • If you want to have a chance of winning against Tangerine in 4 years, you are going to have to defeat him on the fact that he has bad policies or the poor job that he has done.  More personal attacks are not going to work and is a dumb position to take and isn’t going to resonate well with younger generations.
    • The Tangerine tape scandal made no sense.  (Again this isn’t a policy issue attack but a personal attack.)  Some have said that this was a “waking up call” to women that were supporting Tangerine, I don’t think that actually holds that much weight.  Lets suppose for a moment that Tangerine had “much better” policies then HRC w.r.t. women’s issues and this tape still came out, the conversation would have been: “So he said these things, but he is still much better for me as a woman.”  America’s history of presidents has been a string of questionable views on women and marriages etc, one more shouldn’t really be a surprise to anyone regardless of how many times the media plays it.  Most people will never directly interact with the president, “we” (or at least I), do not care if the person who wins the presidency is likable or has done some questionable things in the past, all that “we” care about is whether or not their policies are going to be good for “us.”

     

    What a surprise mainstream media:
    Donald Trump would have lost US election if Bernie Sanders had been the candidate
    How the Washington Post killed Bernie Sanders’ candidacy
    The Democratic Party Establishment Is Finished

     
  • Matthewfl 1:50 pm on 5 November, 2016 Permalink | Reply
    Tags: politics   

    Theoretical online voting system 

    With the election a few days away, I found myself recently looking at the state of voting in America and contemplating that there is still no online based voting system in place. The main arguments against online voting or digital based voting has been that it would be hard to verify, and would require a computer security expert to identify if something has been tampered with.

    Now to create a system which is “provably incorruptible” would be very difficult and impracticable to expect  average poll workers to verify the correctness of such a system, however there is probably a widely unexplored range of systems which are better then our current system but still have some easy to verify properties. In this post I attempt to create a voting system which no worse the our current voting system with respect to voter fraud and ensuring the votes are counted.

    First, lets consider the state of out current voting system, specifically the voting by mail system. Step 1 is to go online and register your address along with some identifying voter information. At a later point in time, the state will mail a ballot to your address which contains a “voting key” which maps various voting positions (who you would like for some office, or position on a proposition) to some number [1, 200]. To vote, you bubble in your corresponding chosen numbers, wrap your ballot in more paper call a “secrecy sleeve” put this in another envelope and mail it into the ballot counting location. Presumably, once your ballot arrives, someone will check the identifying information on the mailing envelope to prevent duplication and then pass the ballot and secrecy sleeve to someone else who is just going to count the votes. This two level operation would prevent people from knowing who you voted for assuming that the first poll works don’t look at the ballot inside the secrecy sleeve. In terms of ensuring that your vote is counted, we have to then trust the second poll worker to count the votes correctly. We might use more then one person for this second part to prevent errors etc.

    Now in making a new system, we have to consider what possible vulnerabilities exist in the current system, as those could still be allowed in the new system:

    1. Trusting the United states postal services (USPS) to properly deliver mail — If your ballot never makes it back to the polling place then it will essentially be lost (there might be some ways to identify that it is lost, but still no real/easy recourse for ensuring that it gets counted)
    2. The USPS needs to get you your ballot to you in the first place — If the ballot was sent to the wrong address, it is possible that someone fills in the ballot in your name, forges your signature and then mails it back in
    3. People are trusted to bubble in their choice correctly — Eg, they are at least able to understand that given some “number” on a “ballot key” that they are suppose to transfer that number correctly to the ballot itself
    4. A malicious poll worker could prevent a vote from getting counted which they didn’t agree with — Given that your vote is easily identifiable on the ballot, it is trivial for someone to reject all ballots which have bubbled in number 10 (ideally there are two or more people to double check that this does not happen)

    Given this set of vulnerabilities in our current system, lets now try to design a better system which allows for internet voting:

    Our first steps would be very similar to the current voting system where someone goes online and registers with their mailing address. The state would then mail out a “ballot key” to the provided address. The reason that we would still require that something is mailed out is that there is currently no good way to identify a citizen online in a secure way, however like the current vote by mail system, it is acceptable to trust the USPS as a “broker of identities.” Now our vote by internet ballot key will be a bit different from existing ballots where each vote is represented by [1, 200] and instead have a number in [0, 2^{256}], additionally, instead of having a single number (say 10) represent a position on the ballot, each voter would be given a unique number for each position on the ballot. (A sample ballot is at the end of this post) We can then use a simple website to collect the keys which represent a person’s choice. Given that each user has different codes generated for their ballot, we can the use untrusted channels to communicate these codes to the vote counting authority. Additionally, we do not have to worry about “suppressing” the vote that a poll worker disagrees with since the intermediate communication mechanisms don’t even know which vote was cast. All they know is that they are responsible for is communicating some number to the voting authority.  Even if voters computer was infected with a computer virus, it would be unable to change your vote since it only knows the key that was enter representing your choice, while the other keys would only be present on the paper ballot key that was mailed to your address.

    Some properties of this system:

    1. We are still trusting the USPS to properly identify people and communicate information with them securely. (Same as before)
    2. Submitting a vote for someone else still depends on your receiving or intercepting their ballot and “forging” a signature (Same as before)
    3. The intermediaries do not know your vote (better than before) — Now your vote is a number that is specific to you, so the only people who will know the vote is the person who generated the “voting key” and whoever has the voting key
      1. The intermediaries can not suppress your vote based off who you voted for — They do not who you voted for so it can not be suppressed based off this reason
      2. Your vote can not be changed after the fact — Changing your vote would require that the malicious intermediary have your “voting key book” which was printed by the state and mailed by the USPS (which is a trusted medium)
      3. Your computer (now technically also an intermediary) can not change your vote even if it was infected with a virus —  your computer does not know the alternate keys you were provided since they were printed and mailed, so it can not switch between them.
    4. The number that you have to enter is a lot longer (worse) — Currently you only enter some number [1, 200], however a 256 bit number is notably longer.  Given how people are already used to entering 16 digit credit card numbers this might not be such a big issue.  We could even include check sums to limit erroneously entering some thing (bitcoin already uses a 32 bit checksum on all addresses)

    Some might point out that one could set up false voting websites to try to confuse voters or perform a DOS attack on the voting website. First, with false websites, we could follow the trend of some banking websites where an images is displayed to ensure that you are on the correct website, however we might make it some confirmation code that is sufficiently long that it would be difficult to counterfeit and easy to print on a “ballot key.” For the DOS attack, we already know how to build systems that can deal with DOS attacks. Additionally, if we have a confirmation code system which confirms that a vote has been recorded, then any mechanism which takes a voting key and returns the confirmation code is as good as any other. This means you could have voting via email or even text message which are “more difficult” to perform a DOS attack against or allow for third-party websites to spring up to collect votes as they would have to still be backed by the state vote recording authority.

     

    Sample theoretical ballot key:

    Politician Office key Confirmation code
    T Sandwich President NMosFJizjPgUV2BKEhGErjvUZzKZVAFCyqPy7w3t6dD FuT8VDz3z
    Giant D President Tru4oZn9y3RMnxAsb7g5Gqs7Fu13FX4ExaQSer6ykeD bFcCf4MJA
    None of the above President LaGeinvoBUduEbovp5zJDQJ6DQEdgSqZWgXzArhiugS xjzEahMdi

    (These politician names are based of this current season of south park.)

     

    TL;DR: Online voting where you are still mailed your ballot via USPS and your ballot contains keys that we consider “secure” and you only submit one key that corresponds to your vote.

     

     

    Update / additional background info / other posts on this topic:

    While the mathematical concepts in these schemes are sound, it would be difficult to convince the public at large.  In these cases, people would have to just generally trust that someone has done the correct thing with designing the voting systems.  From an academic point of view, if these systems are implemented correctly, there wouldn’t even be a need for there to be vote checkers since they would “have to be correct.”

     
  • Matthewfl 1:58 pm on 1 November, 2016 Permalink | Reply
    Tags:   

    JSApp.us Shutdown 

    This post is a few weeks late, however JSApp.us has been shutdown. At the time that JSapp was first released, it was the first open node.js hosting platform. It featured an easy to use web based code editor (which was extremely useful at the time as developing node.js on windows was fairly difficult and required compiling code yourself). The system also provided CNAME support as well as subdomains for quickly developing demonstrating applications. There was even a command line client which allowed for quick deployments from the command and uploading and downloading files from ones file system.

    At last, JSApp’s time has come to an end. It has seen no major updates since 2011 (source) and the state of developing Node.js applications has moved with new API’s (that were not supported by JSAPP) as well as new compile to JavaScript languages (which were also unsupported by JSApp).

    Given the abundance of alternate Node.js hosting options and the age of the system, it seems that essentially all users have already migrated off the platform, so this change is unlikely to been disruptive. The source will remain available on github if anyone interested in some of the internals, however given how dated the system is, I am assuming that there are better solutions today for nearly all aspects of the system.

    jsapp screenshot

     
  • Matthewfl 3:40 pm on 24 March, 2016 Permalink | Reply
    Tags: intel phi   

    Intel Xeon Phi for “cheap” 

    (This work and post were originally from early 2015, some aspects may still be useful, eg the kernel patch for the lower end motherboards)

    Recently Intel has been selling their a version of their Xeon Phi coprocessor under a promotional deal at 90% off.  This means that one can get a device with 8GB of ram (on the coprocessor) and 228 hardware threads (57 physical cores, and each with 4 hyper-threads) at a reasonable price of ~$200.

    When I first purchased the Phi, I was planning to put it into somewhat of an old desktop system that I had lying around, however the motherboard did not support the major requirement of “Above 4G decoding” on the PCI bus.  4G decoding deals with how the system allocates the memory resources on items on the PCI bus.  With the Intel Phi, unlike consumer level GPUs it will present all 8G as a memory mapped region to the host computer.  (more about 4G decoding)   Based off some research on this obscured feature, it appeared that most “modern” motherboard have some support for this feature.  I decided to get an Asus h97m-plus which is fairly cheap, and fit the computer tower that I already had on hand.  While this motherboard does list the above 4G decoding in its bios and manual, I am not actually sure if this feature has been properly tested, as unlike Asus higher end motherboards, there was no mention of this mother board specifically working with the above 4G decoding.  Based off examining the early booting sequence it appeared that the Linux kernel was attempt to find alignment positions for PCI devices which were equal in size to the requested memory region (8GB in this case) or depends on the BIOS to perform the PCI allocation before booting.  For the higher end motherboards which the Intel Phi was known to work with, it appears that the “more powerful BIOSes” were allocating memory for the Phi, but in the case of this lower end motherboard, the BIOS was unable to deal with a request to allocate 8GB of memory and thus falling back onto the kernel to perform allocations.  Following this observation, I made a small kernel patch (here) which changes requests for alignment larger than the maximal size to be simply aligned at the maximal supported size.  With the  components in this computer it appears that even with this change the Intel Phi gets aligned to a 4GB boundary and is able to still function correctly.

    The next challenge once the Phi was communicating with the computer was to prevent the chip from overheating.  The discounted versions of the Phi did not include any fans as it was designed for use in server environments.  Additionally being a 300+W accelerator, the system is capable of generating a lot of heat.  As such, many “typical” fan solutions that I tried failed to keep the chip cool for longer than a few minutes.  I eventually landed on the high-powered tornado fan which can move over 80 cubic inches of air a minute.  I ended up having to zip tie this over one end of the chip to ensure that there was enough directed airflow to keep it functional.  (warning to future users: This fan actually does sound like a tornado, constantly).

    Having the entire system functional for over a year now, I have managed to use the Phi for a handful of computations.  While there is decent opportunity in improved performance, this chip really requires that you design customized software for it specifically.  This is especially true given that Intel Phi is less popular than graphics cards with Cuda, where many mathematical tools and frameworks already have customized backend targeting Cuda requiring limited effort on the user’s part.  While this chip has a nice promise of being able to execute normal x86 instructions, this seems to be of fairly limited use since the only compiler that will target the chip and use its specialized vector instructions is Intel’s own compiler (similar in nature to Cuda).  This makes it fairly difficult to natively run any non trivial programs on this chip as any external libraries require their own porting effort.  (As an accelerator which accelerates embedded methods similar to Cuda this chip works fine, just if you are trying to run a program without the hosts involvement.)

     

     

    Photos of the setup:

    This slideshow requires JavaScript.

     
  • Matthewfl 9:00 pm on 27 October, 2014 Permalink | Reply
    Tags: ideas   

    Estimating the n percentile of a set 

    Here is an interesting idea that I had recently, this is just a high level concept about how it would work, there are no proofs for error bounds or quality, in fact there would be a handful of orderings of sets which would product terrible results.

     

    To accurately compute the n^{th} percentile value of a given set of values, one ends up having to sort the values which if they are not integers, can take O(n \log n) time.  However, then getting value itself is trivial since it is simply going to the correct place in the sorted values.  I am thinking that one should be able to compute an estimate for the n^{th} value for a randomly ordered set of elements in O(n) time.

    First the basic idea, imagine that you have a set of elements X = \{ x_i \}.  Then if we had this set sorted S, then finding the n^{th} (0 \le n \le 1) percentile out of X would simply become s_{n * |S|}.  This implies that we have n * |S| elements less than s_{n *|S|} and |S|*(1 - n) elements greater.  From this point we can imagine constructing two sets, \alpha = \{ x \in X : x < s_{n * |S| } \}, \beta = \{ x \in X : x > s_{n * |S|} \} which represent the elements greater and less then the n^{th} value.  This also means that \frac{|\alpha|}{|\alpha| + |\beta|} \approx n.  Now using this concept for \alpha and \beta, we can attempt to construct these sets while iterating through X by having a current estimate for the value s, and then tracking the elements current in each set.  This essentially becomes if \frac{|\alpha|}{|\alpha| + |\beta|} > n + \epsilon then take the current value of s and insert it into \beta, then take the largest element out of \alpha and set it equal to s.  In the case of \frac{|\alpha|}{|\alpha| + |\beta|} < n - \epsilon we simply do the reverse by inserting the current value of s into \alpha, and then removing the smallest value from \beta and setting it equal to s.

    Now the problem has been reduce to splitting X into two different sets and keeping them sorted somehow to be able to get and remove the largest/smallest elements.  However, this would give an exact answer to finding the n^{th} percentile.  Now given that we want to find an estimate, we can imagine capping the size of these sets to k, where k is a small number such as 2.  Then instead of tracking the elements themselves, we are simply counting the number of elements that are greater or less than the current s value.  Additionally, we have the sets tracking the k elements that are largest but still less then s, and the smallest but greater then s.  As we iterate through the set, we can track the k values in \alpha, \beta and the size of \alpha, \beta accordingly, and when we want to change the value of s to keep \frac{|\alpha|}{|\alpha| + |\beta|} \approx n, then we just take the new value from \alpha or \beta respectively and update the cardinality of each set.

    An additional extension to this algorithm could be for an n \approx .999 then the size of \beta would only be \frac{1}{1000} the size of the original data set.  Keeping track of largest .001 of the data set would not be linear in the side of the data set, however it could take out a considerable chunk of computation depending on how large or small the value of n is.

     
  • Matthewfl 3:36 pm on 26 August, 2014 Permalink | Reply  

    Reducing specific use cases in a language to improve overall usability 

    This last summer I spent a considerable amount of time refactoring i-lang.  Since I started implementing this programming language in early 2012, it had accumulated quite a bit of cruft, and it was difficult to continue moving forward.

    Refactoring type system

    One of the first internal improvements that was made was to overhaul the internal type system.  Before, the type system was simply passing around a boost::any.  However, this became trouble some as all parts of the code would have to know about each type so that it could cast it locally.  In many places the code began to look like:


    if(a->type() == typeid(Object*)) {
     a = boost::any_cast<Object*>(a);
     // ...
    } else if(a->type() == typeid(Array*)) {
     // ...
    }

    It became even worse when there were two different types involved, as can be seen in the case of performing arithmetic.

    Now, the type system has been rewritten to better make use of C++ template and virtual function systems.  This means that one can write code like:


    ValuePass a = valueMaker(1);
    ValuePass b = valueMaker(2.4);

    ValuePass c = a + b;

    assert(c->cast<float>() == 3.4);
    assert(c->cast<int>() == 3);
    assert(c->cast<bool>() == true);
    assert(c->cast<std::string>() == “3.4”);

    The real beauty of this type system can be seen when using the foreign function interface, where the value of arguments can be “injected” into local variables.  This means that a function can be written as:


    ValuePass func(Arguments &args) {
     int a;
     double b;
     std::string c;
     ilang::Function d;

     args.inject(a, b, c, d);

     // The arguments are casted to the type of the local variable
     // in the case that a cast fails, an exception will be raised
     // which is the logical equivalent to calling a function with
     // the wrong type signature

     // ... (body of function)

     return valueMaker("hello world");
    }

    Changes in the type system at the language level

    Before this refactor, types in i-lang were defined in a global table of identifiers called variable modifiers.  A variable could have more than one modifier attached to it, and each modifier is used to check the value being set to a variable.  What this roughly translates to would be something like:


    define_type("Unsigned", {|a|
     // check if the value being set, which is passed as argument a, is greater or equal to 0
     return a >= 0;
    });

    // use this type like
    Unsigned Int b = 5; // both Int and Unsigned are called to validate the value of '5' is the correct type.

    Looking at this implementation of a type system, it does not seem that bad when compared to other programming languages.  As displayed here it is missing the concept of a namespace or import scope, but otherwise it is fundamentally a type system where types are given names and then later used to reference that type.  However, this concept of a type having a name fundamentally goes against i-lang’s concept of names only being used as place holders for values, vs having an explicit places in the language. (eg: class required_name_of_class {} vs name_bound_for_use_later = class {}).  This lead me to question what does a type system fundamentally do.  In lower level languages such as C/C++ a type system provides information about the space required for an object, however in a higher level language such as python (which i-lang is more similar to on this point) values are just fixed sizes and then pointers to larger dynamically sized objects when required.  Type system also provided casting between primitive types, such as a 4 byte integer casted to a floating point.  This on its own isn’t that interesting as there are limited number of primitive types and similar operations can be accomplished with code like `1 * 1.0` or `Math.floor(1.2)` for casting.  Finally, type systems provide a way to identify the type of some value, which can be further used by a language to provided features such as pattern matching when calling a function.  Now, choosing to focus on this last issue of a type system lead to i-lang concept of a type system, which is that a type is simply a function which can identify if a value is a member of a given type.

    The idea of using just a function to identify a type can sound a little strange at first, however, after playing with it some, the idea itself can be seen to be quite powerful.  Here is a quick example of using this type system to implement pattern matching on the value passed to a function.


    // This is a function which returns a function that compares the value
    // the returned function gets with the value the first function was called with.
    GreaterThan = {|a|
     return {|b|
      return b > a;
     };
    };
    LessThan = {|a|
     return {|b|
      return b < a;
     };
    };
    EqualTo = {|a|
     return {|b|
      return b == a;
     };
    };

    Example_function = {|GreaterThan(5) a|
     return "The value you called with is greater than 5";
    } + {|LessThan(5) a|
     return "Then value you called with is less than 5";
    } + {|EqualTo(5) a|
     return "The value you called with is equal to 5";
    } + {
     return "The value you called with didn't compare well with 5, must not have been a number";
    };

    Int GreaterThan(0) LessThan(10) int_between_zero_and_ten = 5;

    In the Example_function, we are combining 4 different functions, each with different type signatures.  Additionally, we are creating types on the fly by calling the GreaterThan/LessThan/EqualTo functions which are using annoymious functions and closures.  This method also allows for classes to have a place in the type system.  We can now easily create special member of a class to check if a value passed is an instance or interface of the class type.


    sortable = class {
     Function compare = {};
    };

    // check that the array members all implement a compare function and are an instance of a class
    sort = {|Array(sortable.interface) items|
     // ...
    };

    Refactoring Objects and Class to appear like functions

    Before, i-lang use syntax similar to Python or JavaScript dicts/objects when constructing a class or object.  This meant that these items looked like:


    class {
     Function_to_check_type type_name: value_of_type,
     Another_type another_name: another_value,
     no_type_on_this: value
    }

    However, in ilang, except when prefixed with `object` or `class` the use of `{}` means that it is a function.  (eg: a = { Print("hello world"); };)  Additionally, colons are not used anywhere else in the language, which made me question why was this case such a special one.  This lead me to ask why not use equal signs and semicolons like everywhere else, meaning that defining a class would appear as:


    class {
     Function_to_check_type type_name = value_of_type;
     Another_type another_name = another_value;
     no_type_on_this = value;
    }

    Furthermore, is there any reason to exclude loops and if statements when constructing a class?  Allowing control flow when the class definition is constructed makes this act identical to a function.


    a = true;
    class {
     if(a) {
      b = 5;
     } else {
      b = 6;
     }
    }

    Final thoughts

    By starting by cleaning up the internals of i-lang, it allowed me to take another look at the language and determine why certain choices were made at first.  Bootstrapping a new programming language takes a considerable amount of effort, and could easily lead someone to defining something like print a function (as python recently changed away from in version 3).  In my case, I was creating special mechanisms for constructing class/objects and defining types for variables largely because the type system, scopes, and function internal interfaces were all poorly designed in the first iteration of the language.  Now that the internals have been cleaned up, it makes it easier to see that these changes are wins for the language.  Now, I doubt that I would have been able to come up with these changes right off the bat with the first implementation, as it was only through the pain of the first implementation for which the need for these changes became apparent.

     
  • Matthewfl 12:24 am on 28 January, 2013 Permalink | Reply  

    Current state of ilang 

    This is the first week back at school, which means that the speed of development on ilang will begin to fluxuate again. Over this last break, and the final parts of last semester, I was able to clean up a bunch of features with the ilang programming language.

    When I originally set out to create this new language, I was mostly thinking about how the syntax would be different and the specific features that I would want in the language to make it worthwhile of being created. However, what I failed to thinking about was what really make a programming language useful today is the fact that there are an absurd amount of libraries already programmed, debugged and available for download for any successful language. As a result of this realization, I have been working on trying to get useful libraries written for the language. However in trying to work with the language, I have a few beliefs that I am trying to stick with, but I am not so such how well it will work out.

    The first belief, is that there should be no need to pause or have any sort of timer system, this is because I feel as if the language should attempt to run as fast as possible and focus on processing data. However when writing testing frameworks to automatically check if the language is working, it has become apparent that it would be useful to have some timer system. ** I still haven’t written in the timer system so this is still somewhat of an issue of internal debate.

    Along with the timers, there is the problem of getting data efficiently in and out of the programming language. One of the concepts that I have for the future with this language is that the system will be able to distribute the computations between a large number of computers, this means that it is not particularly natural for the system to have access to features of the local computer, such as the file-system or the standard in/out. I am thinking that for the time being that the system could be designed to have access to the standard input and part of the file-system, however when the system becomes networked across a lot of computers, there could possibly be a way to specify where the standard input/output should go along with where the file-system has access to. The other alternate that I am working on, is using the concept of just having the http server be the way to input data, however I expect that it will become cumbersome quickly to input large data files. A possible compromise is to use the parameters to support some way to map specific files or folders to names that can be access from inside the program.

    When considering the libraries that have already been included, there is still a lot of space for development. The modification library is still lacking too many features to be really usable. Along with the modification, the database library, still lacks the ability to save arrays into the database. The complication with arrays is trying to figure out an efficient way to store the data without causing a significant slow down. My plan for arrays in the database, was that they would be “smaller” then objects in the database, as objects when stored in the database do not have any limiting factors for the number of elements. However with arrays, I plan to try to have the system load all the data into memory when reading an array from the database. However the current way that the system is designed, it does not allow for the elements, to be easily accessed under the hood. I am thinking that the system might try and store all the elements in their own container, however the problem with this is that there would be a large number of database queries when trying to read data out. And inserting in the middle of the array would require a large number of read and writes into the database. However, on the flip, if the system was using one database object to store all of the array elements, there would be a few read and writes, but the object would likely be come very large very quickly.

    The current plan for future features, is to keep working on adding more libraries into the core system to be useful. This will mostly focus on processing and interpreting data. (Web server, http client, as well as methods to parse string such as regex, also expecting some form of CSV or sqlite reader for when working with data that has been downloaded). Along with supporting reading the data, I plan to attempt to include a wide range of machine learning and artificial intelligence libraries and tools. Hopefully, it will be easy enough to integrate their database systems with the ilang database. Once these components are in somewhat of a usable state, I should have some framework for which experiments with the code modification library can be conducted.

    Random last thought:
    I currently plan for the ilang system to have a formatting tool, much like golang does. The reason for this, is when working with the modification system, I plan to have the system completely regenerate a file using the system “print from code tree” feature. This should greatly simplify writing the code back to disk, when compared to other possible ideas such as trying to find where the code has been changed with corresponding lines and then try to recreate those changes on disk.

     
  • Matthewfl 3:48 pm on 22 September, 2012 Permalink | Reply
    Tags: 2012, , fair, job   

    Job fair 

    Wednesday of this last week, I went to a EECS job fair.  I found that essentially all companies there were eager to talk with anyone that came by and take a résumé.  I have even gotten some contact back from some companies already which I was not expecting as I am a first year student and I was told by many older students that first years do not typically get contacted or get internships/jobs.

    I think that this point brings up some very interesting beliefs that are in the tech industry.  Many of these have been noted before on countless blogs and new articles, but rehashing these from my own experiences I believe might be helpful to some people.

    1. The tech industry does not particularly care about you age, gender, race etc.  All they care about is if you are technically skilled and are able to get the job done.
    2. Github profiles are a big deal.  At the top of my résumé along with my physical address, I decided to put my internet address.  This included things such as my email, website and github profile.  I want to note that even while talking with an individual he was looking at my résumé said “o nice you have your link to your github profile” and then continued to circle it with his pen and said that he was amazed how many people he talked to that did not have some public code profile.  Today this “public code profile” has become a standard for hiring in the coding world.
    3. Do not emphasis what you do not have when talking with the representatives.  I was waiting behind some student who was talking with the hulu representatives about what he has.  First he starts out with what he does not like about the hulu product, the fact that there are ads even though he is paying for it (guess what you pay for Cable and there are still ads, there is no reason hulu can’t do the same).  The representatives then interrupts him and asks about what sort of projects he has.  He states that he has made a few smallish things.  The representatives then continues to ask if he has a github (see point 2).  Which he replies that he does, but there is nothing on there because…..some answer like, “my ideas are soooo great that I do not want people coping them I might sell them at some point…..”

    These are somewhat of tips/points/what not to do experiences.  Like I said at the top, these ideas have been noted all over the internet and are not rocket science.

    Additionally, in line with my last post about hackathon projects.  Everything that you write should be version controlled somehow.  You can use git without using github and just keep it on your local machine.  Additionally, when you decided that your code is either “done” or not going to continue into a money-making company, or only going to survive as a free product, then you might as well create a public repo on github or similar so that if/when you are at a job fair, there is something on

     
  • Matthewfl 10:28 pm on 15 September, 2012 Permalink | Reply  

    The Hackathon paradigm 

    Today I was looking at a lot of the different applications that I normally use on my phone and through my web browser.  If I was talking to someone who had never experienced either of these before, they might believe that I generally have a single specific device for a specific task, and that in terms of functionality there would be little overlap of major features.  However, for anyone that has experienced either of these mediums, they are aware of the wide variety of applications and programs that duplicated the functions of other applications.

    My complain is two-pronged on this issue of applications that start or continue with a Hackathon paradigm.  First, the old Unix philosophy says do one thing and do one thing well.  On this specific point, I believe that many applications start out with the right intentions, however over time there is a significant feature creep effect that takes place.  I believe that this is the result of “Hackathon project” becoming more than “Hackathon projects.”  The developer of these application feel that they are going to form a company with a project that in terms of complexity should really be no more than a side project.  Essentially what I am saying, is to develop and maintain your application X, it _might_ take 2 hours a week once it is launched.  However, these individuals choose to attempt to make a 50 hour, startup styled, work week out of these types of projects.

    My second issue with the “Hackathon projects” is don’t assume that something that you can write in 24 hours is not easily copied.  There are a lot of very complex and difficult problem that exist in the world today.  Nearly all of these types of problems can not be solved in a short period of time.  Additionally, if a product can be made in 24 hours given the proper tools and skills, then it is only a matter of time before there is a large number of people who will be competing with you.  Some might even have better products given that they were able to replicate your product in such a short period of time, and then vastly improve upon there after.

    With these two issues, I am not saying that Hackathons are bad,Hackathons provide a relativity easy way to create demos to show of skills.  However when it comes to publishing a product I believe that people should think a little more about what they are going to create and invest enough time into the product such that it is not just going to be another 1 of 100 “identical” products.

     
  • Matthewfl 1:26 am on 8 September, 2012 Permalink | Reply  

    ilang – a new programming language 

    I have been working on developing a new type of programming language over the last few months.  From a physiological perspective, it is interesting to try to create ones idea programming language and see what they create, what features does one add, change or remove.

    ilang is still very pre-alpha-ish software, and I don’t expect anyone to jump and start using or even download and try it at this point, there are still too many things unimplemented at the moment to call it a complete language.

    An overview of what is different:

    • The power of anonymous.  In may programming language, functions, classes and may other types are given names that can be looked up inside the type system.  However in ilang, attempts to have classes and function and other basic types anonymous, or without names.  The names are viewed as being useful to the programmers who are writing the programs.
    • Short access to making function: a function is anything between {}.  this means that to create function main, it looks like: main = {};
    • optional typing: This seems to be a new and growing trend in programming languages that are coming out now.  By default the types on variables are not checked at all.  This means that more than one check can be imposed on a variable.  Also additional types can be easily encoded with some additional C++ code, and soon ilang code from within the language itself.  The type checking can also do some other interesting things, more later.
    • Built in database: This has always been a feature that I think higher level languages should include, now web browsers include localStorage for example.  This feature can already take all primitive types and objects.  classes and functions can not yet be encoded into the database.  However the advantages having this built in as already noticeable in testing.
    • Python style imports, but only at the top and no ‘as’, *.  I originally made it this way, because I was annoyed when some code I was reading through would import something in the middle.  One you have to find where the import was performed to figured out what is being included into the source, also if you go back to modify above the point where the import was performed then you have move the import up so that it will be available.

    To come/planned/supppppper pre-alpha features:

    • Access to the parse tree of files and the ability to modify the tree, changing the code that is running.  There will be the typical system where it is able to access the direct parse tree in a ‘raw’ format, however I plan to experiment some and try and find some natural way to access and modify the syntax tree.  In the aspect of natural modification, I have already noticed some of these properties being easily implemented as a function can easily be replaced by simply overwriting its value.
    • Network distribution.  I am hoping to turn this language into something that is useful when it comes to processing large amounts of data.  The trend at this point has been to utilities a large number of computer and attempt to distribute tasks in a sort of map reduce framework.  The plan at this point is to allowed for unstructured computation across the network where the system automatically determine if it is most effective to move a copy of the code for the computation or to move the data that the computation is working on.

    Very incomplete documentation

    Link to Github repo

    This is only the first intro post.  I believe that there will be more to come as the language further advances and develops.

     
  • Matthewfl 1:08 pm on 28 July, 2012 Permalink | Reply
    Tags: comments   

    Comments about another state 

    This post was written over a number of days in a disjoint fashion. You were warned.

    Already in the first days of this vacation I have come up with a number of amusing comments and I felt that it was only right that they be recorded somewhere.

    First the air port that we flew into had something like two terminals. I was joking that there would be one person working TSA and when we came back through they would simply look at us and ask if we are going to blow up the plane and then let us through. I was somewhat close in that there were two people working TSA, however they did appear to have the basic requirements to be called TSA.

    Second when we got in the car the radio came on and started playing the most stereotypical country music came on and started saying “she thinks my tractor is sexy.” The next time that we turned on the radio it was a song about loving their front porch more than anything else.

    The place that we are going to for our volunteer activities was considerably farther away then originally expected. As a result my mother was driving somewhat above the speed limit. As she claims this is her first ticket in a while, gg. No California roles here.

    We are the volunteering on the blackfoot reservation this last week. It has been somewhat slow, it appears that it has something to do with the native culture. They believe that it will end up working out and stuff.
    The main thing that we have been doing this week is helping the community college get themselves together for their next semester. They are quite small with only 500 students and a few classrooms.
    When we are not helping Getty the college ready, we have been interacting with the community.

    Today we are leaving the volunteering group. We are going to be driving up and around to do a ‘normal’ vacation.

     
  • Matthewfl 10:53 am on 18 June, 2012 Permalink | Reply  

    A review of a year 

    Looking back, it has been a while since I last updated this blog.  I think this is because it was a crazy fully packed year.  Additionally my time was consumed by great projects such as EDD of which we worked in secret to prevent our opponents from discovering what we were doing.

    Looking back this year start like a typical senior year.  There were good and there were bad classes all wrapped up.  Thinking back to first semester I was talking 8 periods worth of classes filling every second of my school day.  On top of the classes there was the constant need to work on college applications and the constant draw to manage the EDD project.  Just thinking about this now is making me tired.

    Second semester was not any less tiring than the first.  While I decided to take one less class giving myself some free time during the middle of the day, FRC and EDD started their full on attack.  During the six weeks of the FRC season there was a lot of complications.  For starters there was the complications of figuring out how to shoot the ball.  Additionally, there was the complication of balancing on the bridge at the end to get the absurd amount of bonus points.  In an attempt to accomplish this we built a challenging frc robot.

    At the same time as the FRC build season, there were many complications in the EDD class.  We were having to save the rover team as a joint effort between the two EDD teams.  The rover was designed by members of both teams to be the robot that we were going to rescue during the mission, but what happened is the rover team was unable to accomplish their mission of building the rover and thus required the attention of both teams to ensure its success.

    After the build season for FRC, EDD started stepping up its game.  We started really working on our countless iterations on our frame designs for our sphere bot and adapters that we were working on for the propellers.  It was during this time when we discover the complications with the propellers and lift.  Our complications stemmed from the beginning with the RC community and not a full understanding of aerodynamics.  According to the RC community if one were to spin the propellers infinitely fast there would be an infinite amount of thrust.  Of course this does not make sense, but the point at which the equations were said to break down is where the pressure above is not able to replenish the air quickly enough to provide lift.  What we did not know is there are also the limits of the propellers flattening out which causes the propellers to not provide more thrust at a higher speed.  At this point we started our many iterations in an attempt to find possible solutions to our problem.  Out first attempt was to greatly reduce the mass of the robot, we were able to cut the weight in half.  Second we attempt to the most complex parts to date in our machine shop, the manufacturing of our own propellers.  Additionally we looked into making a larger frame, as going to propellers that were only 4″ larger in diameter were designed for 1.5 Kg vs 200 g helicopters which was a lot closer to the range that we were looking for.  We tried all of these designs in parallel which was a stunning effort and props (pun intended) to everyone who contributed.  In the end there were still complications with getting the flybar working, and at the last second our last flybar broke and we were unable to replace it.  In typical EDD fashion, we look back and say if only we had another month we could have accomplished what we set out to do (or if only we did not waste 6 weeks attempting to build the rover then maybe the sphere would have flown).

    After all of the academics and robotics for the year was over, there were what now seems like an eternity of senior activities and gradutation to sum up the year.

     
  • Matthewfl 7:20 am on 15 March, 2012 Permalink | Reply
    Tags: competition, ,   

    First day of FRC robotics 

    Today is out first day of the robotics competition.  There is still a lot of work that needs to be done.For starters, the robots drive system needs to be rebuilt.  What happened when we shipped the robot was they were unable to find the right size gears.  Second the intake system needs to be taken apart and have is belts put on.  Finally the bbad needs to be installed (we just finished machining 2 days ago).
    Once the mechanical team finished the electrical and programming team get to go to work.  The robot has not even begin to be weired.  Additionally programming team has written a bunch of code hopping that robot will work, but because the robot was “finished” only minutes before we bagged it there was no chance to test.
    All in all, if the robot I’d working by the end of Thursday, it is going to require a miracle.

     
c
Compose new post
j
Next post/Next comment
k
Previous post/Previous comment
r
Reply
e
Edit
o
Show/Hide comments
t
Go to top
l
Go to login
h
Show/Hide help
shift + esc
Cancel