2011
04.25

Materialize Maven Modules in Eclipse

This note relates to:

  • Eclipse SDK 3.6.2
  • m2eclipse plugin 0.12.1
  • subclipse 1.6.17

In Eclipse, modules in a Maven project are materialized separately when the project is first checked out of a Subversion repository. However, when new modules are loaded as a result of a repository update, the new modules are not automatically materialized.

To prompt Eclipse to create new projects for the acquired modules:

  1. Right-click the project containing the new modules
  2. Select Import… from the context menu
  3. Select Maven > Existing Maven Projects and press the Next button
  4. Check the boxes next to the projects to materialize and press the Finish button

2011
04.25

This note relates to Eclipse SDK 3.6.2.

When editing a Java source file in Eclipse, a warning is reported for each import statement that is specified but not used. Unused import statements can be removed automatically by using the Java editor menu entry Source > Organize Imports.

This can be automated so that it happens each time a file is saved:

  1. Open “Preferences” page via menu Window > Preferences
  2. Navigate to Save Actions for Java Editor (Java > Editor > Save Actions)
  3. Select Perform the selected actions on save
  4. Select Organize imports
  5. Press button Apply

From this point on, when a Java file is saved, import statements will automatically be cleaned.

2011
04.05

Better Passwords with a Reasonable Effort

In this note, principles of good password creation are offered and discussed. At the end of the note, a process is offered to create passwords that follow the presented principles and require a reasonable mental effort.

Many “easy” tricks are offered over the web. It is up to the reader to analyze those approaches against the principles offered here.

Principles

Characteristics of a good password:

  • a password should be used only once for a given purpose
  • the compromise of one password should not compromise other passwords
  • a password should contain a large amount of entropy

Password used only once:
Each password should be used for only one service. This matters when one of your passwords is compromised. For example, let’s say you have accounts with two services (Google and Facebook). If the same password is employed for both services, then an attacker who becomes aware of the password for one service can also access the other service.

Password compromise:
Passwords should not be related in such a way that knowledge of a password for one service reveals passwords for other services. For example, although the following passwords “123google”, “123facebook” and “123amazon” are different, an attacker discovering one would easily guess the others.

Entropy:

Entropy represents the amount of chaos associated with an entity. In the case of passwords, entropy is related to the number of tests an attacker would have to conduct to test all the possible passwords.

For example, if a bike lock was made of 4 numbers ranging from 1 to 8, then the total number of possible combinations would be 8 * 8 * 8 * 8 = 4096 combinations. Therefore, an attacker without knowledge of the proper combination would have to try at most 4096 different combinations to unlock the bike.

Entropy is expressed as the number of bits required to hold the total number of combinations. In the bike lock example, 4096 combinations fit in a 12-bit value. This can be verified since 2 to the power of 12 is 4096. Each rotating wheel in the bike lock contributes 3 bits of entropy since each wheel can take 8 different positions (2 ^ 3 = 8). If the same lock had only three wheels, the lock would provide only a total of 9 bits of entropy.

In the bike example, if each wheel could occupy 10 positions, instead of 8, then each wheel would provide 3.32 bits of entropy (2 ^ 3.32 = 10) and a bike lock made of four of those wheels would provide a total of 13.3 bits. Indeed, the lock provides 10 000 different combinations, which is 2 ^ 13.3.
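
The arithmetic above can be checked with a few lines of JavaScript. This is only a sketch: the function name lockEntropyBits and its parameters are invented for this illustration.

```javascript
// Entropy, in bits, of a lock made of independent wheels.
// "wheels" and "positions" are illustrative names, not taken from any library.
function lockEntropyBits(wheels, positions) {
    // Each wheel contributes log2(positions) bits, and independent wheels add up.
    return wheels * Math.log2(positions);
}

console.log(lockEntropyBits(4, 8).toFixed(1));  // four wheels, 8 positions each
console.log(lockEntropyBits(3, 8).toFixed(1));  // three wheels, 8 positions each
console.log(lockEntropyBits(4, 10).toFixed(1)); // four wheels, 10 positions each
```

Rounded to one decimal, the three lines correspond to the 12, 9 and 13.3 bits worked out in the text.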

What needs to be remembered from this exercise are the following concepts:

  • each position in a lock provides an amount of entropy
  • a larger number of possible values in a position means a larger amount of entropy
  • the total amount of entropy provided by a lock is the sum of entropy provided by each independent position in the lock

Passwords are similar to Bike Locks

A password is similar to a bike lock where each character in the password represents a mechanical wheel that can take a number of different values.

If a password is made only of lowercase letters (26 values), then each character is worth 4.7 bits of entropy.
If a password is made of lower and upper case letters (52 values), then each character is worth 5.7 bits of entropy.
If a password is made of lower and upper case letters, along with numbers and special characters (72 values), then each character is worth 6.2 bits of entropy.

To be safe, a password should have at least 64 bits of entropy. A great password should have 128 bits of entropy.

A question easily comes to mind: should each password be at least 11 characters? The answer is yes. Continue reading: a trick is given below on how to create long passwords and easily remember them.
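
The 11-character figure follows from dividing the target entropy by the entropy of one character and rounding up. Here is a minimal sketch of that calculation, assuming every character is chosen independently and uniformly; the name charsNeeded is made up for this example.

```javascript
// Smallest number of independent, uniformly chosen characters
// needed to reach a target amount of entropy.
function charsNeeded(targetBits, alphabetSize) {
    var bitsPerChar = Math.log2(alphabetSize);
    return Math.ceil(targetBits / bitsPerChar);
}

console.log(charsNeeded(64, 72));  // mixed case, numbers and special characters
console.log(charsNeeded(64, 26));  // lowercase letters only
console.log(charsNeeded(128, 72)); // the "great password" target
```

With the 72-value alphabet, 64 bits requires 11 characters, while a lowercase-only password needs 14 to reach the same target.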

Entropy Revisited

Astute readers might offer a password made out of words such as:

OrangeApple

There are 11 characters in this password. Since the characters are lower and upper case letters, each character offers 5.7 bits of entropy, which would mean 62.7 bits of entropy, right? No. If a password is made out of words, then the characters are related and not independently random. Therefore, an attacker trying words and not characters might find the password in fewer tries.

There are 171,476 English words in current use (between 17 and 18 bits of entropy for each word), so OrangeApple is worth only about 35 bits of entropy, not 62.7.
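
To compare the two ways of counting, here is a rough sketch that computes both estimates. It assumes an attacker who picks words uniformly from the 171,476-word vocabulary quoted above; the function names are invented for the example.

```javascript
// Vocabulary size quoted in the note.
var ENGLISH_WORDS = 171476;

// Entropy when the attacker guesses whole words.
function wordPasswordBits(wordCount, vocabularySize) {
    return wordCount * Math.log2(vocabularySize);
}

// Entropy when every character is independent and uniformly chosen.
function charPasswordBits(length, alphabetSize) {
    return length * Math.log2(alphabetSize);
}

console.log(wordPasswordBits(2, ENGLISH_WORDS).toFixed(1)); // "OrangeApple" as two words
console.log(charPasswordBits(11, 52).toFixed(1));           // 11 truly random letters
```

The word-based estimate comes out at a little over half of the character-based one, which is why whole words are a poor source of entropy.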

Password Principles Restated

Characteristics of a good password:

  • a password should be used only once for a given purpose
  • the compromise of one password should not compromise other passwords
  • a password should contain a large amount of entropy (minimum 64 bits, better if 128 bits and above):
    • many characters in the password
    • varied characters (lower and upper cases, numbers, special characters)
    • unrelated characters (avoid whole words)

Mental Approach to Creating Better Passwords

Create passwords by combining the following tricks:

  • reuse a high-entropy constant component in all passwords to provide a minimum amount of entropy
  • add a variable component based on the context in which the password is used
  • apply a mental transformation known only to you

Constant Component
Create a long string of seemingly random characters by using a saying you want to repeat to yourself. Use a phrase that means something to you. Make it a message that will improve your life, since you will type it all the time. For example:

RoYoTiEv8Ki!

The above string of characters can easily be remembered if one uses the phrase: “Rotate Your Tires Every 8000 Kilometers!”. This component yields approximately 72 bits of entropy (12 characters at about 6 bits each).

Variable component
This is the easiest part. The variable component should be based on the context in which the password is used. If this is a password for Google, then the variable part could be “google”. If the password is used to unlock your laptop, then the variable part could be “laptop”. The variable component should be easy for the password owner to recall from the context in which the password is used.

Mental Transformation
The aim of the mental transformation is to hide the variable part from a potential attacker. Continuing the examples above:

Password for Facebook: AfRoYoTiEv8Ki!cebook

Password for Google: OgRoYoTiEv8Ki!ogle

In these examples, the mental transformation consists of:

  • taking the first two letters of the variable component, reversing them, capitalizing the first letter of the result and putting them in front of the constant component; and,
  • taking the remainder of the variable component and appending after the constant component.
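
The transformation used in these examples can be written down as a short function. This is only an illustration of the example scheme, not a recommendation to automate it. The name makePassword is invented here, and the capitalization of the prefix is inferred from the two sample passwords.

```javascript
// Illustrative only: reproduces the sample transformation from the note.
function makePassword(constant, context) {
    // First two letters of the variable component, reversed ("fa" becomes "af")
    var reversed = context.slice(0, 2).split("").reverse().join("");
    // The sample passwords capitalize the first letter of that prefix
    var prefix = reversed.charAt(0).toUpperCase() + reversed.slice(1);
    // Prefix, then the constant component, then the rest of the variable component
    return prefix + constant + context.slice(2);
}

console.log(makePassword("RoYoTiEv8Ki!", "facebook")); // AfRoYoTiEv8Ki!cebook
console.log(makePassword("RoYoTiEv8Ki!", "google"));   // OgRoYoTiEv8Ki!ogle
```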

In these examples, the entropy provided by the password is always at least 72 bits. The approach follows the presented password principles and, with a bit of practice, requires little mental effort.

Each reader should find for himself/herself a suitable mental transformation which is personal and original.

2010
12.19

Writing a Reduce Function in CouchDb

This note relates to:

  • CouchDb version 1.0.1
  • curl version 7.21.0
  • Ubuntu 10.10

This article discusses some of the details in writing a reduce function for a CouchDb view. A reduce function is used to perform server-side operations on a number of rows returned by a view without having to send the rows to the client, only the end result.

The tricky part of reduce functions is that they must be written to handle two “modes”: reduce and re-reduce. The signature of a reduce function is as follows:

function(keys, values, rereduce) {
   ...
}

If the parameter “rereduce” is false, then the function is called in “reduce” mode. If the parameter “rereduce” is true, then the function is called in “re-reduce” mode.

The aim of a reduce function is to return one value (one javascript entity, a scalar, a string, an array, an object…) that represents the result of an operation over a set of rows selected by a view. Ultimately, the result of the reduce function is sent to the client.

The reason for the two modes is that the reduce function is not always given at once all the rows that the operation must be performed over. For efficiency reasons, including caching and reasons related to database architecture, there are circumstances where the operation is repeated over subsets of all rows, and then these results are combined into a final one.

The “reduce” mode is used to create a final result when it is called over all the rows. When only a subset of rows is given in the “reduce” mode, then the result is an intermediate result, which will be given back to the reduce function in “re-reduce” mode.

The “re-reduce” mode can be called once or multiple times with intermediate results to produce the final result.

Therefore, the tricky part of reduce functions is to write them in such a way that:

  1. the keys and values from a view are accepted as input
  2. the result is convenient as output for the client
  3. the result of the reduce function is accepted as input in “re-reduce” mode
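
To make the two modes concrete before the full example, here is a small stand-alone sketch that runs outside CouchDb. The reduce function only counts and sums. The simulate driver and the sample rows are hypothetical and only mimic how the database might split the rows into subsets; the real chunking is decided by CouchDb.

```javascript
// A reduce function that counts and sums scores in both modes.
function reduceFn(keys, values, rereduce) {
    var result = { sum: 0, count: 0 };
    for (var i = 0; i < values.length; ++i) {
        if (rereduce) {
            // values are intermediate results produced by this same function
            result.sum += values[i].sum;
            result.count += values[i].count;
        } else {
            // values are the scalars emitted by the map function
            result.sum += values[i];
            result.count += 1;
        }
    }
    return result;
}

// Hypothetical driver: reduce each chunk, then re-reduce the intermediate results.
function simulate(rows, chunkSize) {
    var intermediates = [];
    for (var start = 0; start < rows.length; start += chunkSize) {
        var chunk = rows.slice(start, start + chunkSize);
        var keys = chunk.map(function (r) { return [r.key, r.id]; });
        var values = chunk.map(function (r) { return r.value; });
        intermediates.push(reduceFn(keys, values, false));
    }
    return reduceFn(null, intermediates, true);
}

var rows = [
    { id: "a", key: "Alicia", value: 85 },
    { id: "b", key: "Beth", value: 87 },
    { id: "c", key: "Carmen", value: 58 }
];

// One pass over all rows, and the same rows processed in chunks of two.
var all = reduceFn(
    rows.map(function (r) { return [r.key, r.id]; }),
    rows.map(function (r) { return r.value; }),
    false);
var chunked = simulate(rows, 2);

console.log(JSON.stringify(all));
console.log(JSON.stringify(chunked));
```

Both lines print the same object, which is the property a reduce function has to preserve: the answer must not depend on how the rows were grouped.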

The remainder of this note is an example of a reduce function that computes simple statistics over a set of scores. The example follows these steps:

  1. Create a database in CouchDb
  2. Install a design document with the map and reduce functions under test
  3. Load a number of documents, which are score results
  4. Request the reduction to access the expected statistics

In this example, it is assumed that the CouchDb database is located at http://127.0.0.1:5984. Also, it is assumed that there are no assigned administrators (anyone can write to the database).

Create Database

curl is used to perform all operations.

curl -X PUT http://127.0.0.1:5984/db

Install Design Document

Create a text file named “design.txt” with the following content:

{
    "_id" : "_design/db"
    ,"views" : {
        "stats" : {
            "map" : "function(doc){
                if( typeof(doc.name) === 'string'
                 && typeof(doc.score) === 'number' ) {
                    emit(doc.name, doc.score);
                };
            }"

            ,"reduce" : "function(keys,values,rereduce){
                if( rereduce ) {
                    var result = {
                        topScore: values[0].topScore
                        ,bottomScore: values[0].bottomScore
                        ,sum: values[0].sum
                        ,count: values[0].count
                    };
                   
                    for(var i=1,e=values.length; i<e; ++i) {
                        result.sum = result.sum + values[i].sum;
                        result.count = result.count + values[i].count;
           
                        if( result.topScore < values[i].topScore ) {
                            result.topScore = values[i].topScore;
                        };
                        if( result.bottomScore > values[i].bottomScore ) {
                            result.bottomScore = values[i].bottomScore;
                        };
                    };
                   
                    result.mean = (result.sum / result.count);
               
                    log('rereduce keys:'+toJSON(keys)+' values:'+toJSON(values)+' result:'+toJSON(result));
               
                    return result;
                };
               
                // Non-rereduce case
                var result = {
                    topScore: values[0]
                    ,bottomScore: values[0]
                    ,sum: values[0]
                    ,count: 1
                };
               
                for(var i=1,e=keys.length; i<e; ++i) {
                    result.sum = result.sum + values[i];
                    result.count = result.count + 1;
                   
                    if( result.topScore < values[i] ) {
                        result.topScore = values[i];
                    };
                    if( result.bottomScore > values[i] ) {
                        result.bottomScore = values[i];
                    };
                };
               
                result.mean = (result.sum / result.count);
               
                log('reduce keys:'+toJSON(keys)+' values:'+toJSON(values)+' result:'+toJSON(result));
               
                return result;
            }"

        }
    }
}

Load design document:

curl -X PUT http://127.0.0.1:5984/db/_design/db --upload-file design.txt

Load Documents

curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Alicia","score":85}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Beth","score":87}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Carmen","score":58}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Dalida","score":62}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Elizabeth","score":71}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Fiona","score":75}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Gertrude","score":94}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Halle","score":76}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Irene","score":82}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Julia","score":73}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Kim","score":75}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Lynn","score":91}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Mary","score":56}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Nancy","score":66}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Olie","score":80}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Pat","score":69}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Queen","score":89}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Roseline","score":93}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Sally","score":62}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Trudy","score":71}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Una","score":80}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Victoria","score":79}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Willow","score":68}'

Consume View and Reduction
To see the output of the view:

curl -X GET 'http://127.0.0.1:5984/db/_design/db/_view/stats?reduce=false'

The following result should be reported:

{"total_rows":23,"offset":0,"rows":[
{"id":"7ab05a72d3cf2ad68c5816713e07efc5","key":"Alicia","value":85},
{"id":"7ab05a72d3cf2ad68c5816713e07f78f","key":"Beth","value":87},
{"id":"7ab05a72d3cf2ad68c5816713e07f81a","key":"Carmen","value":58},
{"id":"7ab05a72d3cf2ad68c5816713e0804bc","key":"Dalida","value":62},
{"id":"7ab05a72d3cf2ad68c5816713e081063","key":"Elizabeth","value":71},
{"id":"7ab05a72d3cf2ad68c5816713e081657","key":"Fiona","value":75},
{"id":"7ab05a72d3cf2ad68c5816713e081cf7","key":"Gertrude","value":94},
{"id":"7ab05a72d3cf2ad68c5816713e0824d5","key":"Halle","value":76},
{"id":"7ab05a72d3cf2ad68c5816713e08349e","key":"Irene","value":82},
{"id":"7ab05a72d3cf2ad68c5816713e083a75","key":"Julia","value":73},
{"id":"7ab05a72d3cf2ad68c5816713e083c86","key":"Kim","value":75},
{"id":"7ab05a72d3cf2ad68c5816713e0845b6","key":"Lynn","value":91},
{"id":"7ab05a72d3cf2ad68c5816713e084c70","key":"Mary","value":56},
{"id":"7ab05a72d3cf2ad68c5816713e085c23","key":"Nancy","value":66},
{"id":"7ab05a72d3cf2ad68c5816713e0863dc","key":"Olie","value":80},
{"id":"7ab05a72d3cf2ad68c5816713e086808","key":"Pat","value":69},
{"id":"7ab05a72d3cf2ad68c5816713e087734","key":"Queen","value":89},
{"id":"7ab05a72d3cf2ad68c5816713e0878d9","key":"Roseline","value":93},
{"id":"7ab05a72d3cf2ad68c5816713e087945","key":"Sally","value":62},
{"id":"7ab05a72d3cf2ad68c5816713e0887ee","key":"Trudy","value":71},
{"id":"7ab05a72d3cf2ad68c5816713e08978a","key":"Una","value":80},
{"id":"7ab05a72d3cf2ad68c5816713e08a59f","key":"Victoria","value":79},
{"id":"7ab05a72d3cf2ad68c5816713e08b14e","key":"Willow","value":68}
]}

To include the reduction:

curl -X GET http://127.0.0.1:5984/db/_design/db/_view/stats

which should lead to this report:

{"rows":[
{"key":null,"value":{
   "topScore":94
   ,"bottomScore":56
   ,"sum":1742
   ,"count":23
   ,"mean":75.73913043478261}
   }
]}

Watching the reduction
Looking at the CouchDb logs helps in the understanding of the steps taken by the reduction function:

sudo tail -f /var/log/couchdb/couch.log

Add more documents:

curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Al","score":85}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Ben","score":87}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Carl","score":58}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"David","score":62}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Erik","score":71}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Fred","score":75}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Gordon","score":93}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Horton","score":76}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Ivan","score":82}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Jim","score":73}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Kyle","score":75}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Ludvig","score":91}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Mike","score":53}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Nefario(Dr)","score":66}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Oscar","score":80}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Peter","score":69}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Quentin","score":89}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Rob","score":93}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Sam","score":62}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Taylor","score":71}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Ulysse","score":80}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Victor","score":79}'
curl -X POST http://127.0.0.1:5984/db -H 'Content-Type: application/json' -d '{"name":"Walter","score":68}'

Some of the logs show the function used in “reduce” mode:

reduce keys:
   [
      ["Rob","7ab05a72d3cf2ad68c5816713e07c82c"]
     ,["Roseline","7ab05a72d3cf2ad68c5816713e0736c9"]
     ,["Sally","7ab05a72d3cf2ad68c5816713e0741b5"]
     ,["Sam","7ab05a72d3cf2ad68c5816713e07cc19"]
     ,["Taylor","7ab05a72d3cf2ad68c5816713e07cd53"]
     ,["Trudy","7ab05a72d3cf2ad68c5816713e0741b7"]
     ,["Ulysse","7ab05a72d3cf2ad68c5816713e07d97b"]
     ,["Una","7ab05a72d3cf2ad68c5816713e0746bf"]
     ,["Victor","7ab05a72d3cf2ad68c5816713e07e36f"]
     ,["Victoria","7ab05a72d3cf2ad68c5816713e07478c"]
     ,["Walter","7ab05a72d3cf2ad68c5816713e07eb73"]
     ,["Willow","7ab05a72d3cf2ad68c5816713e074906"]
   ]
values:
   [
      93,93,62,62,71,71,80,80,79,79,68,68
   ]
result:{
   "topScore":93
  ,"bottomScore":62
  ,"sum":906
  ,"count":12
  ,"mean":75.5
  }

Some of the logs show the function used in “re-reduce” mode:

rereduce
   keys:null
   values:[
      {
         "topScore":91
        ,"bottomScore":53
        ,"sum":974
        ,"count":13
        ,"mean":74.92307692307692
      },{
         "topScore":94
        ,"bottomScore":58
        ,"sum":2506
        ,"count":33
        ,"mean":75.93939393939394}
   ]
result:{
   "topScore":94
  ,"bottomScore":53
  ,"sum":3480
  ,"count":46
  ,"mean":75.65217391304348
}

Explanation
To help understanding, let’s reproduce the content of the reduce function, here:

function(keys,values,rereduce){
    if( rereduce ) {
        var result = {
            topScore: values[0].topScore
            ,bottomScore: values[0].bottomScore
            ,sum: values[0].sum
            ,count: values[0].count
        };
       
        for(var i=1,e=values.length; i<e; ++i) {
            result.sum = result.sum + values[i].sum;
            result.count = result.count + values[i].count;

            if( result.topScore < values[i].topScore ) {
                result.topScore = values[i].topScore;
            };
            if( result.bottomScore > values[i].bottomScore ) {
                result.bottomScore = values[i].bottomScore;
            };
        };
       
        result.mean = (result.sum / result.count);
   
        log('rereduce keys:'+toJSON(keys)+' values:'+toJSON(values)+' result:'+toJSON(result));
   
        return result;
    };
   
    // Non-rereduce case
    var result = {
        topScore: values[0]
        ,bottomScore: values[0]
        ,sum: values[0]
        ,count: 1
    };
   
    for(var i=1,e=keys.length; i<e; ++i) {
        result.sum = result.sum + values[i];
        result.count = result.count + 1;
       
        if( result.topScore < values[i] ) {
            result.topScore = values[i];
        };
        if( result.bottomScore > values[i] ) {
            result.bottomScore = values[i];
        };
    };
   
    result.mean = (result.sum / result.count);
   
    log('reduce keys:'+toJSON(keys)+' values:'+toJSON(values)+' result:'+toJSON(result));
   
    return result;
}

In “reduce” mode, the parameter “keys” is populated with an array of elements, each element being an association (array) between a key and a document identifier. In that mode, the parameter “values” is an array of values reported by the view. In the example above, the first part of the function is skipped during the “reduce” mode. The last part of the function accepts scalar values and computes top, bottom, sum and count of the scores. Finally, it computes an average over those scores.

As discussed earlier, this result can be the final result, or an intermediate result. It is impossible for the reduce function to predict how the result is to be used.

In “re-reduce” mode, the parameter “keys” is null while the parameter “values” contains a set of intermediate results. In the example above, the first part of the function is used to merge the intermediate results into a new one. This new result could be the final result, or it could be a new intermediate result.
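
As a sanity check, the merge step can be replayed outside CouchDb on the two intermediate results shown in the re-reduce log above. The sketch below restates the re-reduce branch as a stand-alone function; the name mergeStats is invented here, and Math.max and Math.min replace the explicit comparisons.

```javascript
// Stand-alone restatement of the re-reduce branch: merges intermediate results.
function mergeStats(values) {
    var result = {
        topScore: values[0].topScore,
        bottomScore: values[0].bottomScore,
        sum: values[0].sum,
        count: values[0].count
    };
    for (var i = 1; i < values.length; ++i) {
        result.sum += values[i].sum;
        result.count += values[i].count;
        result.topScore = Math.max(result.topScore, values[i].topScore);
        result.bottomScore = Math.min(result.bottomScore, values[i].bottomScore);
    }
    // The mean is recomputed from sum and count, never averaged from the means
    result.mean = result.sum / result.count;
    return result;
}

// Intermediate results copied from the re-reduce log in this note.
var intermediates = [
    { topScore: 91, bottomScore: 53, sum: 974, count: 13, mean: 974 / 13 },
    { topScore: 94, bottomScore: 58, sum: 2506, count: 33, mean: 2506 / 33 }
];

console.log(JSON.stringify(mergeStats(intermediates)));
```

The merged object has a top score of 94, a bottom score of 53, a sum of 3480 and a count of 46, matching the logged result. Carrying sum and count in each intermediate result is what makes the mean recoverable; the intermediate means alone would not be enough.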

Reduce Functions over a Subset of a View

A reduction does not have to be over the complete set returned by a view. For example, to see only a subset:

curl -X GET 'http://127.0.0.1:5984/db/_design/db/_view/stats?startkey="k"&endkey="n"&reduce=false'

yields only some students:

{"total_rows":46,"offset":20,"rows":[
{"id":"7ab05a72d3cf2ad68c5816713e083c86","key":"Kim","value":75},
{"id":"7ab05a72d3cf2ad68c5816713e08d612","key":"Kyle","value":75},
{"id":"7ab05a72d3cf2ad68c5816713e08de9c","key":"Ludvig","value":91},
{"id":"7ab05a72d3cf2ad68c5816713e0845b6","key":"Lynn","value":91},
{"id":"7ab05a72d3cf2ad68c5816713e084c70","key":"Mary","value":56},
{"id":"7ab05a72d3cf2ad68c5816713e08e00a","key":"Mike","value":53}
]}

If reduction is included:

curl -X GET 'http://127.0.0.1:5984/db/_design/db/_view/stats?startkey="k"&endkey="n"'

then:

{"rows":[
{"key":null,"value":{"topScore":91,"bottomScore":53,"sum":441,"count":6,"mean":73.5}}
]}

Conclusion
Reduce functions can be tricky because of the dual usage. The modes in use are controlled by the CouchDb database and the person designing a reduce function must take into account the various permutations.

NOTE: Do not leave the log statements in view map and reduce functions since they degrade performance.

2010
12.09

This note relates to CouchDb 1.0.1.

In CouchDb, documents accessible via a view can be mapped to multiple keys. When querying for multiple keys, it is possible for a document to be returned multiple times. In some circumstances, this might be the desired behaviour. However, when the desired semantics are to retrieve only one copy of each document matching any key, without duplicates, a different approach is required.

As a note of caution, this article might provide a complicated solution to a problem easily solved another way. I was under the impression that the work covered here could be easily done using a special flag on a view query. However, I cannot readily find it. I am hoping someone will come around and comment on this article with a simpler approach. Until then, the solution presented here will suffice.

The result of a view query is a JSON object that contains an array of rows, each row reporting a document matching the query. A list function is used to transform the result of a view query into a format desired for output. One advantage of using a list function is that a list function has a chance of inspecting each row (or document) before sending to the output.

In this approach, we use a list function to output a result in the exact same format as a view query, suppressing duplicates of documents that were already sent.

The following list function is generic enough to be used with any view that emits the documents as values:

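function(head, req) {
    // Reproduce the envelope of a regular view response;
    // total_rows and offset are copied from the view as-is
    send('{"total_rows":'+head.total_rows
        +',"offset":'+head.offset+',"rows":[');
    var ids = {}        // identifiers of documents already sent
        ,row
        ,first = true   // true until the first row has been written
        ;
    while(row = getRow()) {
        // Skip rows for documents that were already output
        if( !ids[row.id] ) {
            // Separate rows with commas, but not before the first one
            if( first ) {
                first = false;
            } else {
                send( ',' );
            };
            send( toJSON(row) );
            ids[row.id] = 1;
        };
    };
    send(']}');
}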

The input parameter called “head” is used to retrieve the total number of rows and the offset. Then, the list function outputs the “rows” member. Each row is sent as a JSON string, so the list function must take care of inserting the commas at the right place. A map (javascript object) called “ids” is used to recall which documents have already been sent. The key used in the map is the identifier of the document. When a document has already been sent, it is skipped.
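
The core of the approach can be tried outside CouchDb with an ordinary array standing in for the rows returned by getRow(). This is only a sketch of the bookkeeping: the function name dedupeRows and the sample rows are made up for the illustration.

```javascript
// Keep only the first row seen for each document identifier.
function dedupeRows(rows) {
    var seen = Object.create(null); // identifiers already kept
    var kept = [];
    rows.forEach(function (row) {
        if (!seen[row.id]) {
            seen[row.id] = 1;
            kept.push(row);
        }
    });
    return kept;
}

// Hypothetical view output where "doc1" was emitted under two keys.
var rows = [
    { id: "doc1", key: "red", value: null },
    { id: "doc2", key: "red", value: null },
    { id: "doc1", key: "blue", value: null }
];

console.log(JSON.stringify(dedupeRows(rows)));
```

Only the first row of each document survives, so the order of rows in the view determines which key is reported for a document that was emitted several times.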

For example, if a query to a view named “testview” yielded duplicates of a document using the following URL:
http://127.0.0.1:5984/db/_design/test/_view/testview
then duplicates would be removed if the above function was named “noduplicate” and the following URL employed:
http://127.0.0.1:5984/db/_design/test/_list/noduplicate/testview

In conclusion, the presented function is generic enough to be reused in many situations. However, I suspect that a much easier way to perform this will be designed shortly, if it does not already exist.

2010
12.02

Fix dpkg available file in Ubuntu

This note relates to Ubuntu Maverick Meerkat (10.10) but it might apply to other versions, as well.

I wrote this note after my system became unstable following a number of configuration shenanigans. What did not help is that I had just upgraded from 10.04 to 10.10. Therefore, I am not sure that I can explain how to get to the state my platform was in.

Symptom: Every time an apt-get command is run, some sort of error or warning is reported stating that an available package has a corrupt version number.

Cause: The ‘available’ file used by dpkg contains erroneous information or is corrupted.

Solution: Rebuild the ‘available’ file.

Recipe:

1. Back up current file

sudo cp /var/lib/dpkg/available /var/lib/dpkg/available.broken

2. Delete current ‘available’ file

sudo dpkg --clear-avail

3. Rebuild ‘available’ file

sudo aptitude update

After these steps, commands to ‘apt-get’ should no longer complain about available versions.

2010
12.02

Re-install GNOME-Session in Ubuntu

This note was written while using Ubuntu 10.10. However, it might apply to other versions as well.

Symptoms: After a number of shenanigans involving configuration, I found myself unable to log in to the desktop in Ubuntu. The login screen (GDM) offered my user name. However, once my user name was selected, no sessions were offered. Entering my password and pressing the login button would show a brief blank screen and then return me to the login screen.

Cause: Somehow, the gnome-session was removed from installation.

Solution: Re-install gnome-session

Here is the recipe:

  1. At the login screen (GDM), press the key combination CTRL-ALT-F1. This should drop you out of GDM and into a terminal screen
  2. Log in to the terminal using your username and password
  3. At the prompt, enter the command “sudo apt-get install gnome-session”
  4. Then, “sudo reboot”

If the gnome-session was already installed and you get an error attempting to install it again (or the answer “gnome-session is already installed”), then reconfiguring it might suffice: “sudo dpkg-reconfigure gnome-session”

2010
11.24

Installing restricted CODECs is easier in Ubuntu 10.04 than in previous versions.

sudo apt-get install ubuntu-restricted-extras
sudo /usr/share/doc/libdvdread4/install-css.sh
sudo apt-get install ffmpeg gxine vlc banshee faac k9copy

2010
11.23

Installing CouchApp on Ubuntu 10.04

CouchApp is a Python tool to help develop, upload and clone applications meant for CouchDb. Those applications are also known as “couchApps”.

The following recipe is used to install couchapp on Ubuntu 10.04. To use couchapp, you probably first need to install “couchdb”, but this is readily available from the usual repositories.

The issue in installing couchapp on Ubuntu 10.04 is that one needs to rely on some personal packages made available via launchpad.net.

Warning: This recipe installs keys from developers on your platform. From this point on, your platform will trust packages made available from those individuals.

From a high-level view, three packages are required:

  1. add-apt-repository: utility tool to easily add a new repository
  2. couchapp: the python tool itself
  3. python-restkit: a python library that couchapp is dependent on

Installing add-apt-repository

sudo apt-get install python-software-properties

Installing python-restkit

sudo add-apt-repository ppa:bchesneau/couchdbkit
sudo apt-get update
sudo apt-get install python-restkit

Installing couchapp

sudo add-apt-repository ppa:couchapp/couchapp
sudo apt-get update
sudo apt-get install couchapp

2010
10.14

Use a 24-hour clock in Ubuntu

This note applies to:

  • Ubuntu Lucid 10.04
  • Mozilla Thunderbird 3.0.8
  • Mozilla Lightning 1.0b1

I am a fervent user of the 24-hour clock. However, when I install a new platform, I often accept the default locale of en_US.UTF-8. In general, I do not mind this locale. However, applications such as Thunderbird use the locale to adjust the display of various elements, including time. It also affects plug-ins such as Lightning.

This note is a recipe that changes the default time display from 12-hour clock to 24-hour clock.

  1. Edit the default locale file
    sudo gedit /etc/default/locale
  2. Add the following line at the end of the default locale file:
    LC_TIME="en_DK.UTF-8"
  3. Reboot the computer… (yeah, it is lame)
    sudo reboot

That’s it! From then on, applications that follow the locale will display the time in 24-hour clock format.

To verify that you successfully changed the locale, use the locale command:

locale

The entry LC_TIME=en_DK.UTF-8 should be displayed.