2011
07.06

I have been using a Solid State Drive (SSD) for some time and I have decided to share my notes on how to set one up, along with the references I have followed. The configuration I use has two disks attached to my platform: the first is the SSD and holds the operating system; the second is a magnetic drive and holds my home partition.

This note applies to:

  • Ubuntu 11.04
  • ATA INTEL SSDSA2M080G2GC
  • ext4

From research, the important aspects that need to be dealt with when using an SSD are:

  1. Enable TRIM
  2. Disable access time stamps on files
  3. Adjust the disk scheduler
  4. Move log files to a RAM drive

Each of these topics is discussed separately below.

Enable TRIM

TRIM is the process by which an operating system tells a drive that a sector is no longer in use. This is not necessary for magnetic drives and, historically, operating systems did not provide this information to drives as it increased the amount of information traveling between a host and its drives. However, this information is crucial for SSDs, which need it to perform proper wear leveling.

Information on TRIM can be found via these links:

In Ubuntu, TRIM is enabled via the file “fstab” located in “/etc”. One needs to find the drive associated with the SSD and add the option “discard”. To edit the “fstab”:
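
For example, using nano (any editor will do):

  sudo nano /etc/fstab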

Then add the “discard” option to the drive concerned. In my example, the root drive is the SSD. Therefore, the resulting file looks like:
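
Something along these lines, the UUID being that of your own root partition:

  # <file system>  <mount point>  <type>  <options>  <dump>  <pass>
  UUID=<uuid-of-ssd>  /  ext4  discard,errors=remount-ro  0  1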

After the “fstab” file is modified, one must reboot the system for the change to take effect.

After reboot, one should ascertain that TRIM was in fact enabled. To do so, a trick provided by Nicolay Doytchev is used (link above). Note that this trick is only claimed to work with ext4; one might not get the expected results with other file systems.

1. Become root and move to a directory managed by the SSD:
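
For example (any directory on the SSD partition will do; here, root’s home directory):

  sudo su -
  cd /root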

2. Create a file with random bytes in it:
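
A small file of random data can be created with dd (the name “tempfile” is arbitrary):

  dd if=/dev/urandom of=tempfile count=10 bs=512 oflag=direct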

3. Get and record the disk sector address for the beginning of the file:
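
hdparm reports the mapping of the file to disk sectors. The output below is illustrative; your numbers will differ:

  hdparm --fibmap tempfile

  tempfile:
   filesystem blocksize 4096, begins at LBA 0; assuming 512 byte sectors.
   byte_offset  begin_LBA    end_LBA    sectors
             0    5184576    5184583          8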

Note the number under begin_LBA and use it for <ADDRESS> below:

4. Access the file data directly from disk (replace /dev/sdX with the drive for which you enabled TRIM in “fstab”):
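
  hdparm --read-sector <ADDRESS> /dev/sdX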

The result of this should be random bytes and look like this:
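
Illustrative only, since the actual bytes are random:

  /dev/sdX:
  reading sector 5184576: succeeded
  4aa6 1e7c 03d9 b5f2 887d c1a0 59e3 720b
  ...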

5. Delete the file and synchronize the file system so that changes are pushed to the SSD:
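
  rm tempfile
  sync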

6. Finally, repeat the command from step 4. Reading the same disk sector should yield an empty sector. For example:
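
All zeros, indicating that the sector was trimmed:

  /dev/sdX:
  reading sector 5184576: succeeded
  0000 0000 0000 0000 0000 0000 0000 0000
  ...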

Disable access time stamps on files and directories

Every time a file is accessed, a time stamp for last access is recorded with the file. It is believed that this information is generally unnecessary and that it is the source of a lot of wear on an SSD. Turning off access time stamps does not disable last modified time stamps, which are crucial to a lot of computer tools.

However, it is possible that some tools you are using rely on last access time stamps. Therefore, I suggest that you disable access time stamps separately from all other changes you make to your system. Turn it off and then leave your system unchanged for a while to see how it performs in this mode. In that period, try out all your tools. Tools that rely on file caching might not do as well as others.

Interesting links about last access time:

Disabling last access time is done by adding the options “noatime” and “nodiratime” for the SSD to the file “fstab” under the directory “/etc”. First, edit the “fstab” file:
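
  sudo nano /etc/fstab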

Then, locate your drive and add the options “noatime” and “nodiratime” at the appropriate line. For example, my file looks like:
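
Along these lines, the UUID being that of your own root partition:

  UUID=<uuid-of-ssd>  /  ext4  noatime,nodiratime,discard,errors=remount-ro  0  1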

After the file is saved, one must reboot the operating system for the changes to take effect.

After reboot, you can verify that the recording of access time is no longer in effect with the following tests:

1. Move to a directory managed by the SSD and writable to your user. If your home directory is on the SSD, then the following would work:
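
  cd ~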

2. Create a file for testing purposes:
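
  echo "this is a test" > testfile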

3. Record the last access time:
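
The “stat” command reports the time stamps kept for the file; note the line starting with “Access:”:

  stat testfile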

4. Wait for a minute or two to elapse (you can use ‘date’) and then read the file:
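
  date
  cat testfile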

5. Repeat command from step 3 and compare the results. If the same time is returned, then “noatime” is in effect. If a newer time is returned the second time, then the operating system is still recording access time for the file.

6. Clean up after yourself and delete the test file:
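
  rm testfile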

Adjust Disk Scheduler

By default, the CFQ scheduler is used to access a drive. This scheduler is designed for magnetic drives and takes into account variables such as seek times. On SSDs, access time is fairly constant. Therefore, there is no need for a complex scheduler, and addressing requests on a first-come-first-served basis is adequate. I have used the “noop” scheduler and this is what I demonstrate. However, there is an interesting post at this blog that might convince you otherwise.

The trick offered here changes the scheduler after the boot is performed, as the operating system is starting its services. Therefore, the boot process does not benefit from these changes. The post above provides links on how to select the scheduler at the boot loader.

Myself, I like keeping my boot loader as clean as possible, so I have adopted this trick.

1. Edit the file “/etc/rc.local”:
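
  sudo nano /etc/rc.local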

2. Add the following line, taking care to replace sdX with the proper drive:
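
  echo noop > /sys/block/sdX/queue/scheduler

Take care to place this line above the final “exit 0” statement of “rc.local”.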

This change will take effect at the next reboot. However, to save a reboot, one can apply the change right away with (again, replacing sdX with the proper drive):
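
  echo noop | sudo tee /sys/block/sdX/queue/scheduler

To verify, “cat /sys/block/sdX/queue/scheduler” lists the available schedulers, with the active one shown in square brackets.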

Moving log files to RAM

Log files are a source of constant writing to disk. Personally, I have not yet made the change to my drive for two reasons:

  1. I am ambivalent about throwing away the logs since I sometimes rely on them to find the causes of crashes. Given that I keep up with the latest version of Ubuntu and that I have a set of problematic video drivers, I feel I must keep my logs around.
  2. Of all the solutions offered, I have not been able to make one work to my liking. I find a lot of solutions presented on the web are finicky and prone to errors.

However, my research indicates that this will lead to an earlier retirement of my drive. Therefore, I offer here a link that I found useful on this topic:

Other readings

Here are a couple of interesting links that might guide you in your decisions. I do not personally endorse the views presented there. However, I believe they provide good food for thought:

Conclusion

With an SSD, the time to boot up has greatly decreased and the performance associated with my development tools has increased. Surprisingly, the general temperature reported by my laptop sensors has also gone down. However, as most activities performed on computers are “web based”, the network-bound activities have remained pretty much the same.

All in all, I have been quite satisfied with my purchase of an SSD.

2011
06.09

Tethering iPhone on Ubuntu 11.04

Reference:

  • Ubuntu 11.04
  • iPhone 3GS with iOS 4.3

The following steps can be used to enable tethering between a platform running Ubuntu 11.04 and an iPhone. At a high level:

  1. Enable tethering on iPhone
  2. Install repository from Paul McEnery
  3. Install necessary packages
  4. Connect

Enable Tethering on iPhone

The use of iPhone tethering might be governed by your wireless plan. If data tethering is allowed, one should be able to turn it on using the “Settings” application.
In the “Settings” application, choose “Personal Hotspot” from the menu, and enable it by pushing the switch to “on”.

If one can not accomplish this step, the remaining steps in this article will be in vain.

Install repository from Paul McEnery

Adding the package “python-software-properties” provides the command “add-apt-repository”. This command makes it really easy to add a repository from a PPA.
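
If memory serves, the PPA is registered as follows (verify the PPA name on its Launchpad page):

  sudo apt-get install python-software-properties
  sudo add-apt-repository ppa:pmcenery/ppa
  sudo apt-get update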

Install necessary packages

Once the repository is installed, installing the packages is straightforward:
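
To the best of my recollection, the packages are the DKMS driver module and its user-space tools:

  sudo apt-get install ipheth-dkms ipheth-utils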

During the configuration of those packages, a kernel module must be built. If the kernel headers are not installed, the following error message is printed on the screen:

Module build for the currently running kernel was skipped since the kernel source for this kernel does not seem to be installed.

In this case, the proper headers must be installed. To find out which headers are required:
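
  uname -r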

Then, install the headers using the following command, taking care to replace the version according to what was returned previously.
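
  sudo apt-get install linux-headers-<version>

For example, if “uname -r” returned “2.6.38-8-generic”, then install the package “linux-headers-2.6.38-8-generic”.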

When this command completes, the iPhone module should be built. If not, then the system can be prompted to rebuild the kernel module:
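
Assuming the module is managed by DKMS under the package name used above:

  sudo dpkg-reconfigure ipheth-dkms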

Connect

Connect the iPhone to the platform using a USB cable. When the USB device is detected, an Ethernet connection should be established automatically.

2011
04.25

Materialize Maven Modules in Eclipse

This note relates to:

  • Eclipse SDK 3.6.2
  • m2eclipse plugin 0.12.1
  • subclipse 1.6.17

In Eclipse, modules in a Maven project are materialized separately when the project is first checked out of a Subversion repository. However, when new modules are loaded as a result of a repository update, the new modules are not automatically materialized.

To prompt Eclipse to create new projects for the acquired modules:

  1. Right-click on the project containing the new modules
  2. Select Import… from menu
  3. Select Maven > Existing Maven Projects and press Next button
  4. Set the check boxes next to the projects to materialize and press Finish button
2011
04.25

This note relates to Eclipse SDK 3.6.2.

When editing a Java source file, in Eclipse, warnings are reported when an import statement is specified but not used. Unused import statements can be removed automatically by using the Java editor menu entry Source > Organize Imports.

This can be automated so that it happens when the file is saved, using the following approach:

  1. Open “Preferences” page via menu Window > Preferences
  2. Navigate to Save Actions for Java Editor (Java > Editor > Save Actions)
  3. Select Perform the selected actions on save
  4. Select Organize imports
  5. Press button Apply

From this point on, when a Java file is saved, import statements will automatically be cleaned.

2011
04.05

Better Passwords with a Reasonable Effort

In this note, principles of good password creation are offered and discussed. At the end of the note, a process is presented to create passwords that follow the presented principles and require a reasonable mental effort.

Many “easy” tricks are offered over the web. It is up to the reader to analyze those approaches against the principles offered here.

Related topics:

Principles

Characteristics of a good password:

  • a password should be used only once for a given purpose
  • the compromise of one password should not compromise other passwords
  • a password should contain a large amount of entropy

Password used only once:
It is necessary to use a password only once. This is important in case one of your passwords is compromised. For example, let’s say you have an account with two services (Google and Facebook). If the same password is employed for both services, then an attacker who becomes aware of the password for one service can also access the other service.

Password compromise:
Passwords should not be related in such a way that knowledge of a password for one service reveals passwords for other services. For example, although the following passwords “123google”, “123facebook” and “123amazon” are different, an attacker discovering one would easily guess the others.

Entropy:

Entropy represents the amount of chaos associated with an entity. In the case of passwords, entropy is related to the number of attempts an attacker would have to make to try all the possible passwords.

For example, if a bike lock was made of 4 numbers ranging from 1 to 8, then the total number of possible combinations would be 8 * 8 * 8 * 8 = 4096 combinations. Therefore, an attacker without knowledge of the proper combination would have to try at most 4096 different combinations to unlock the bike.

Entropy is expressed as the number of bits required to hold the total number of combinations. In the bike lock example, 4096 can be expressed in a value with 12 bits. This can be verified since 2 to the power of 12 is 4096. Each rotating wheel in the bike lock contributes 3 bits of entropy since each wheel can have 8 different positions (2 ^ 3 = 8). If the same lock had only three wheels, the lock would provide a total of only 9 bits of entropy.

In the bike example, if each wheel could occupy 10 positions, instead of 8, then each wheel would provide 3.32 bits of entropy (2 ^ 3.32 = 10) and a bike lock made of four of those wheels would provide a total of 13.3 bits. Indeed, the lock provides 10 000 different combinations, which is 2 ^ 13.3.

What needs to be remembered from this exercise are the following concepts:

  • each position in a lock provides an amount of entropy
  • a larger number of combinations at a position means a larger amount of entropy
  • the total amount of entropy provided by a lock is the sum of entropy provided by each independent position in the lock

Passwords are similar to Bike Locks

A password is similar to a bike lock where each character in the password represents a mechanical wheel that can take a number of different values.

If a password is made only of lowercase letters (26 values), then each character is worth 4.7 bits of entropy.
If a password is made of lower and upper case letters (52 values), then each character is worth 5.7 bits of entropy.
If a password is made of lower and upper case letters, along with numbers and special characters (72 values), then each character is worth 6.2 bits of entropy.

To be safe, a password should have at least 64 bits of entropy. A great password should have 128 bits of entropy.
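
Dividing the entropy target by the entropy of one character gives the minimum length of a password. For example, with characters drawn from the full 72-value set:

  64 bits / 6.2 bits per character ≈ 11 characters
  128 bits / 6.2 bits per character ≈ 21 characters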

A question easily comes to mind: should each password be at least 11 characters? The answer is yes. Continue reading; a trick is given below on how to create long passwords and easily remember them.

Entropy Revisited

Astute readers might offer a password made out of words such as:

OrangeApple

There are 11 characters in this password. Since the characters span lower and upper case, each character offers 5.7 bits of entropy, which would mean 62.7 bits of entropy, right? No. If a password is made out of words, then the characters are related and not independently random. Therefore, an attacker trying words instead of characters might find the password in a smaller number of tries.

There are 171,476 English words in current use (between 17 and 18 bits of entropy for each word), so OrangeApple is worth only 34 bits of entropy, not 62.7.

Password Principles Restated

Characteristics of a good password:

  • a password should be used only once for a given purpose
  • the compromise of one password should not compromise other passwords
  • a password should contain a large amount of entropy (minimum 64 bits, better if 128 bits and above):
    • many characters in the password
    • varied characters (lower and upper cases, numbers, special characters)
    • unrelated characters (avoid whole words)

Mental Approach to Creating Better Passwords

Create passwords by combining the following tricks:

  • a high-entropy constant reused across all passwords to provide a minimum amount of entropy
  • a variable component based on the context in which the password is used
  • a mental transformation known only to you

Constant Component
Create a long string of seemingly random characters by using a saying you want to repeat to yourself. Use a phrase that means something to you. Make it a message that will improve your life, since you will type it all the time. For example:

RoYoTiEv8Ki!

The above string of characters can easily be remembered if one uses the phrase: “Rotate Your Tires Every 8000 Kilometers!”. This component yields approximately 72 bits of entropy.

Variable component
This is the easiest part. The variable component should be based on the context in which the password is used. If this is a password for Google, then the variable part could be “google”. If the password is used to unlock your laptop, then the variable part could be “laptop”. The variable component should be easy for the password owner to recall from the context in which the password is used.

Mental Transformation
The aim of the mental transformation is to hide the variable part from a potential attacker. Continuing the examples above:

Password for Facebook: AfRoYoTiEv8Ki!cebook

Password for Google: OgRoYoTiEv8Ki!ogle

In these examples, the mental transformation consists of:

  • taking the first two letters of the variable component, reversing them and putting them in front of the constant component; and,
  • taking the remainder of the variable component and appending it after the constant component.

In these examples, the entropy provided by the password is always at least 72 bits. The approach follows the presented password principles and, with a bit of practice, requires little mental effort.

Each reader should find for himself/herself a suitable mental transformation which is personal and original.

2010
12.19

Writing a Reduce Function in CouchDb

This note relates to:

  • CouchDb version 1.0.1
  • curl version 7.21.0
  • Ubuntu 10.10

References:

This article discusses some of the details of writing a reduce function for a CouchDb view. A reduce function is used to perform server-side operations on a number of rows returned by a view, sending to the client only the end result rather than the rows themselves.

The tricky part of reduce functions is that they must be written to handle two “modes”: reduce and re-reduce. The signature of a reduce function is as follows:
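
  function(keys, values, rereduce)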

If the parameter “rereduce” is reset (false), then the function is called in a “reduce” mode. If the parameter “rereduce” is set (true), then the function is called in a “re-reduce” mode.

The aim of a reduce function is to return one value (one javascript entity, a scalar, a string, an array, an object…) that represents the result of an operation over a set of rows selected by a view. Ultimately, the result of the reduce function is sent to the client.

The reason for the two modes is that the reduce function is not always given at once all the rows that the operation must be performed over. For efficiency reasons, including caching and reasons related to database architecture, there are circumstances where the operation is repeated over subsets of all rows, and then these results are combined into a final one.

The “reduce” mode is used to create a final result when it is called over all the rows. When only a subset of the rows is given in the “reduce” mode, then the result is an intermediate result, which will be given back to the reduce function in “re-reduce” mode.

The “re-reduce” mode can be called once or multiple times with intermediate results to produce the final result.

Therefore, the tricky part of reduce functions is to write them in such a way that:

  1. the keys and values from a view can be accepted as input
  2. the result must be convenient as the output for the client
  3. the result of the reduce function must be accepted as input in the case of “re-reduce”

The remainder of this note is an example of a reduce function that computes simple statistics over a set of scores. The example follows these steps:

  1. Create a database in CouchDb
  2. Install a design document with the map and reduce function that is tested
  3. Load a number of documents, which are score results
  4. Request the reduction to access the expected statistics

In this example, it is assumed that the CouchDb database is located at http://127.0.0.1:5984. Also, it is assumed that there are no assigned administrators (anyone can write to the database).

Create Database

curl is used to perform all operations.
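
This note assumes a database named “testdb”; create it first:

  curl -X PUT http://127.0.0.1:5984/testdb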

Install Design Document

Create a text file named “design.txt” with the following content:
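
The file below is a sketch consistent with the explanation given later in this note; the design document name (“_design/test”), the view name (“scores”) and the document fields (“student”, “score”) are arbitrary choices. Note that the functions must be encoded as single-line JSON strings:

  {
    "_id": "_design/test",
    "language": "javascript",
    "views": {
      "scores": {
        "map": "function(doc) { log('map: ' + doc._id); if (doc.score) { emit(doc.student, doc.score); } }",
        "reduce": "function(keys, values, rereduce) { log('rereduce: ' + rereduce); var r = {top: -Infinity, bottom: Infinity, sum: 0, count: 0}; var i, v; if (rereduce) { for (i = 0; i < values.length; i++) { v = values[i]; if (v.top > r.top) { r.top = v.top; } if (v.bottom < r.bottom) { r.bottom = v.bottom; } r.sum += v.sum; r.count += v.count; } } else { for (i = 0; i < values.length; i++) { v = values[i]; if (v > r.top) { r.top = v; } if (v < r.bottom) { r.bottom = v; } r.sum += v; r.count += 1; } } r.average = r.sum / r.count; return r; }"
      }
    }
  }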

Load design document:
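
  curl -X POST http://127.0.0.1:5984/testdb -H "Content-Type: application/json" -d @design.txt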

Load Documents
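
A few sample score documents (the students and scores are arbitrary):

  curl -X POST http://127.0.0.1:5984/testdb -H "Content-Type: application/json" -d '{"student":"alice","score":75}'
  curl -X POST http://127.0.0.1:5984/testdb -H "Content-Type: application/json" -d '{"student":"bob","score":58}'
  curl -X POST http://127.0.0.1:5984/testdb -H "Content-Type: application/json" -d '{"student":"carol","score":92}'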

Consume View and Reduction
To see the output of the view:
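
  curl 'http://127.0.0.1:5984/testdb/_design/test/_view/scores?reduce=false'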

The following result should be reported:
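
With the sample documents above, something along these lines (document identifiers elided):

  {"total_rows":3,"offset":0,"rows":[
  {"id":"...","key":"alice","value":75},
  {"id":"...","key":"bob","value":58},
  {"id":"...","key":"carol","value":92}
  ]}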

To include the reduction:
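
  curl 'http://127.0.0.1:5984/testdb/_design/test/_view/scores'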

which should lead to this report:
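
  {"rows":[
  {"key":null,"value":{"top":92,"bottom":58,"sum":225,"count":3,"average":75}}
  ]}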

Watching the reduction
Looking at the CouchDb logs helps in understanding the steps taken by the reduce function:
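
On Ubuntu, the log file is typically found under /var/log/couchdb (the exact path may vary with your installation):

  tail -f /var/log/couchdb/couch.log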

Add more documents:
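
  curl -X POST http://127.0.0.1:5984/testdb -H "Content-Type: application/json" -d '{"student":"dave","score":81}'
  curl -X POST http://127.0.0.1:5984/testdb -H "Content-Type: application/json" -d '{"student":"eve","score":67}'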

Some of the logs show the function used in “reduce” mode:

Some of the logs show the function used in “re-reduce” mode:

Explanation
To help understanding, let’s reproduce the content of the reduce function, here:
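
This is the same function sketched in the design document above, laid out readably and with comments added:

  function(keys, values, rereduce) {
    log('rereduce: ' + rereduce);
    var r = {top: -Infinity, bottom: Infinity, sum: 0, count: 0};
    var i, v;
    if (rereduce) {
      // first part: merge the intermediate results into a new result
      for (i = 0; i < values.length; i++) {
        v = values[i];
        if (v.top > r.top) { r.top = v.top; }
        if (v.bottom < r.bottom) { r.bottom = v.bottom; }
        r.sum += v.sum;
        r.count += v.count;
      }
    } else {
      // last part: accept the scalar scores from the view and
      // compute top, bottom, sum and count
      for (i = 0; i < values.length; i++) {
        v = values[i];
        if (v > r.top) { r.top = v; }
        if (v < r.bottom) { r.bottom = v; }
        r.sum += v;
        r.count += 1;
      }
    }
    // finally, compute an average over those scores
    r.average = r.sum / r.count;
    return r;
  }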

In “reduce” mode, the parameter “keys” is populated with an array of elements, each element being an association (array) between a key and a document identifier. In that mode, the parameter “values” is an array of values reported by the view. In the example above, the first part of the function is skipped during the “reduce” mode. The last part of the function accepts scalar values and computes top, bottom, sum and count of the scores. Finally, it computes an average over those scores.

As discussed earlier, this result can be the final result, or an intermediate result. It is impossible for the reduce function to predict how the result is to be used.

In “re-reduce” mode, the parameter “keys” is null while the parameter “values” contains a set of intermediate results. In the example above, the first part of the function is used to merge the intermediate results into a new one. This new result could be the final result, or it could be a new intermediate result.

Reduce functions over a subset of a View

A reduction does not have to be over the complete set returned by a view. For example, to see only a subset:
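
String keys must keep their JSON quotes in the URL; with the sample data:

  curl 'http://127.0.0.1:5984/testdb/_design/test/_view/scores?startkey="alice"&endkey="bob"&reduce=false'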

yields only some students:
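
  {"total_rows":5,"offset":0,"rows":[
  {"id":"...","key":"alice","value":75},
  {"id":"...","key":"bob","value":58}
  ]}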

If reduction is included:
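
  curl 'http://127.0.0.1:5984/testdb/_design/test/_view/scores?startkey="alice"&endkey="bob"'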

then:
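
  {"rows":[
  {"key":null,"value":{"top":75,"bottom":58,"sum":133,"count":2,"average":66.5}}
  ]}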

Conclusion
Reduce functions can be tricky because of their dual usage. The mode in use is controlled by the CouchDb database, and the person designing a reduce function must take the various permutations into account.

NOTE: Do not leave the log statements in view map and reduce functions since they degrade performance.

2010
12.09

This note relates to CouchDb 1.0.1

In CouchDb, documents accessible via a view can be mapped to multiple keys. When querying for multiple keys, it is possible for a document to be returned multiple times. In some circumstances, this might be the desired behaviour. However, when the desired semantics are to retrieve only one copy of each document matching any key, without duplicates, a different approach is required.

As a note of caution, this article might provide a complicated solution to a problem easily solved another way. I was under the impression that the work covered here could be easily done using a special flag on a view query. However, I can not readily find it. I am hoping someone will come around and comment on this article with a simpler approach. Until then, the solution presented here will suffice.

The result of a view query is a JSON object that contains an array of rows, each row reporting a document matching the query. A list function is used to transform the result of a view query into a format desired for output. One advantage of using a list function is that a list function has a chance of inspecting each row (or document) before sending to the output.

In this approach, we use a list function to output a result in the exact same format as a view query, suppressing duplicates of documents that were already sent.

The following list function is generic enough to be used with any view that emits the documents as values:
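
This is a sketch matching the description that follows; start(), getRow(), send() and toJSON() are functions provided by the CouchDb list environment:

  function(head, req) {
    start({"headers": {"Content-Type": "application/json"}});
    // reproduce the shape of a normal view query result
    send('{"total_rows":' + head.total_rows + ',"offset":' + head.offset + ',"rows":[');
    var ids = {};      // recalls the documents already sent
    var first = true;  // helps place the commas at the right places
    var row;
    while (row = getRow()) {
      if (!ids[row.id]) {
        ids[row.id] = true;
        if (first) {
          first = false;
        } else {
          send(',');
        }
        send(toJSON(row));
      }
    }
    send(']}');
  }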

The input parameter called “head” is used to retrieve the total number of rows and the offset. Then, the list function outputs the “rows” member. Each row is sent as a JSON string, so the list function must take care of inserting the commas at the right places. A map (javascript object) called “ids” is used to recall which documents have already been sent. The key used in the map is the identifier of the document. When a document has already been sent, it is skipped.

For example, if a query to a view named “testview” yielded duplicates of a document using the following URL:
http://127.0.0.1:5984/db/_design/test/_view/testview
then duplicates would be removed if the above function was named “noduplicate” and the following URL employed:
http://127.0.0.1:5984/db/_design/test/_list/noduplicate/testview

In conclusion, the presented function is generic enough to be reused in many situations. However, I suspect that a much easier way to perform this will be designed shortly, if it does not already exist.

2010
12.02

Fix dpkg available file in Ubuntu

This note relates to Ubuntu Maverick Meerkat (10.10) but it might apply to other versions, as well.

I wrote this note after my system became unstable following a number of configuration shenanigans. What did not help is that I had just upgraded from 10.04 to 10.10. Therefore, I am not sure that I can explain how to get to the state my platform was in.

Symptom: Every time an apt-get command is run, some sort of error or warning is reported stating that an available package has a corrupt version number.

Cause: The ‘available’ file used by dpkg contains erroneous information or is corrupted.

Solution: Rebuild the ‘available’ file.

Recipe:

1. Back up current file
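
The file lives under “/var/lib/dpkg”:

  sudo cp /var/lib/dpkg/available /var/lib/dpkg/available.backup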

2. Delete current ‘available’ file
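
  sudo rm /var/lib/dpkg/available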

3. Rebuild ‘available’ file
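
One way to rebuild it is to dump the packages known to apt and feed the result back to dpkg:

  apt-cache dumpavail > /tmp/available
  sudo dpkg --update-avail /tmp/available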

After these steps, commands to ‘apt-get’ should no longer complain about available versions.

2010
12.02

Re-install GNOME-Session in Ubuntu

This note was written while using Ubuntu 10.10. However, it might apply to other versions as well.

Symptoms: After a number of shenanigans involving configuration, I found myself unable to log in to the desktop in Ubuntu. The login screen (GDM) offered my user name. However, once my user name was selected, no sessions were offered. Entering my password and pressing the login button would show a brief blank screen and then return me to the login screen.

Cause: Somehow, the gnome-session was removed from installation.

Solution: Re-install gnome-session

Here is the recipe:

  1. At the login screen (GDM), press the key combination CTRL-ALT-F1. This should drop you out of GDM and into a terminal screen
  2. Login to the terminal using your username and password
  3. At the prompt, enter the command “sudo apt-get install gnome-session”
  4. Then, “sudo reboot”

If the gnome-session was already installed and you get an error attempting to install it again (or the answer “gnome-session is already installed”), then reconfiguring it might suffice: “sudo dpkg-reconfigure gnome-session”

2010
11.24

Installing restricted CODECs is easier in Ubuntu 10.04 than in previous versions.

References: