2016-04-06

How to disable (reject) any root password on Debian and Ubuntu

This blog post explains how to disable (reject) any root password on Debian and Ubuntu, thus rejecting login attempts as root. Becoming root with sudo (by typing the calling user's password) or ssh (using a public key) remains possible.

TL;DR Run as root: passwd -d -l root

How to become root if password-based root logins are (or will be) disabled?

Before disabling password-based root logins, make sure you have other ways to become root. One possible way is running sudo (without arguments) from a non-root user. To make this work, first you have to install sudo by running (without the leading #) as root:

# apt-get install sudo

as root. (Ubuntu systems come with sudo preinstalled, Debian systems don't have it by default.) Then run as root, replacing MyUser with your non-root login name:

# adduser MyUser sudo

After running this, running sudo as that user will ask for the user's password (not the root password), and when typed correctly, you will get a root shell, and will be able to run commands as root. (Type exit to exit from the root shell.)

An alternative to sudo for becoming root without a password is running ssh root@localhost. For that you need a properly configured sshd (with PermitRootLogin without-password or PermitRootLogin yes in /etc/ssh/sshd_config), creating an SSH key pair and appending the public key to /root/.ssh/authorized_keys. If you need help setting this up or using it, then please ask a Unix or Linux guru friend.

How to disable password-based root logins

To disable (reject) any root password on Debian and Ubuntu, run this (without the leading #) as root:

# passwd -d -l root

This effectively changes the 2nd field line starting with root: in /etc/shadow to !, thus the line will start with root:!:, making login, su, ssh (when using password authentication, i.e. no public key) reject login attempts as root. Typically the password wouldn't even be asked for, but if it is, any password would be rejected. An alternative to the command above is editing the /etc/shadow file manually (as root), and adding the !. Also the -d flag is not necessary, without it the password hash is still kept in /etc/shadow (but a ! is prepended to disable it).

Ubuntu comes with this default (root:!: in /etc/shadow), Debian doesn't.

If you want to disable the root password in ssh only (and allow password-based root logins in login and su), then instead of running the command above, add (or change) the line

PermitRootLogin without-password

to /etc/ssh/sshd_config (as root), and then run (as root):

# /etc/init.d/ssh restart

Please note that there are ways to permit a root login without a password (or with an empty password), but this is very bad security practice, so this blog post doesn't explain how to do it.

How to enable password-based root logins

To enable password-based root logins again, run this as root:

# passwd root

It will ask you to specify the new password for root.

2016-02-22

keybase.txt

Please disregard this post, it's a cryptographic proof checked by keybase.io/pts.

==================================================================
https://keybase.io/pts
--------------------------------------------------------------------

I hereby claim:

  * I am an admin of https://ptspts.blogspot.com
  * I am pts (https://keybase.io/pts) on keybase.
  * I have a public key with fingerprint 537D 2E8D 6FAB 3265 9A1F  8767 33BB 974C 2FE0 93F2

To do so, I am signing this object:

{
    "body": {
        "key": {
            "eldest_kid": "01110f1fe32d1d6263e9674e11d7249ac66f46d9f8c54c896c16205dcf68203493b90a",
            "fingerprint": "537d2e8d6fab32659a1f876733bb974c2fe093f2",
            "host": "keybase.io",
            "key_id": "33bb974c2fe093f2",
            "kid": "01110f1fe32d1d6263e9674e11d7249ac66f46d9f8c54c896c16205dcf68203493b90a",
            "uid": "51f5f6cc0a558175f85f29dc164fe900",
            "username": "pts"
        },
        "service": {
            "hostname": "ptspts.blogspot.com",
            "protocol": "https:"
        },
        "type": "web_service_binding",
        "version": 1
    },
    "ctime": 1456178821,
    "expire_in": 157680000,
    "prev": "b14bb983d61e5b2097e313d7b246fa7e17030b598d6ae6a7a1545f36b5ef75c2",
    "seqno": 27,
    "tag": "signature"
}

which yields the signature:

-----BEGIN PGP MESSAGE-----
Version: GnuPG v1

owGtUj1rVEEUfSYaMSCYxk6L6dRlMx9vvhZsLLQRtBAbi2U+7mwe2bx5vveyMYRt
JYWFIGil4t9QG20sUu4fEG20CtrYWDizKDaWDgPDvXPOmTP33sdnV4uVDfbm2c3N
j0++nzj6aou7Rw8PD5CNfh+NDtA2LA+Yeuj68Xbl0QhhQggOJACjnnhBBQMtZAmE
eElLbZwQoRReB+V46ZQWjgiKuXdBKIpZqZnV2KABClU9gbZpq7pPspxJT0F5EYxl
VHBtSFBSSMas1bJ0NADWLNBE3IpdZiRz1nQwrGLKpWC8tPcP/H/2vbuU4yTwIJzD
hnNFJA+KB6p9IpUBNMYZ2EFbmx1I6Kbv0HyAUmJWOcg1zZ/4e5n20E7jpGtiP3Rx
J7GbNvbRxWkCbPV9042yQL/fZMYe2PFvrbGtap8qmRgzaLsq1mhEEtL1VRYnJRdE
KkXJAMGDpmphXGUEl0LhtPI7MEuSlpTW3tOKeUGAW4q1BEaYl5aWqScSiMQMW65T
iwwIIw3hJQ9MWA5BcpcL3cH9OqIRlcmomSTRrprUpt9tAc3XD6+fLDZWirVTK3nG
ivUz5/4MXnereH7xx+3zcfH05+LT26ub7+68evR+/q14celaWDv+fGXR3Fg9Xny5
fPrlh9cXfgE=
=N5V9
-----END PGP MESSAGE-----

And finally, I am proving ownership of this host by posting or
appending to this document.

View my publicly-auditable identity here: https://keybase.io/pts

==================================================================

micropython for Linux i386, statically linked

This blog post is to announce statically linked binaries for Linux i386 of MicroPython.

MicroPython (Python for microcontrollers) is an open source reimplementation (see sources on GitHub) of the Python 3.4 language for microcontrollers with very little RAM (as low as 60 kB). The CPython interpreter is not used at all, MicroPython has a completely separate implementation in C, supporting the full Python 3.4 language syntax, but with a much smaller standard library (i.e. much fewer modules and classes, and existing classes have fewer methods). Unicode strings (i.e. the str class) are supported though.

MicroPython can be cross-compiled to many different platforms, including multiple microcontrollers (including the ESP8266 ($5) and the pyboard ($40)) and to Unix systems (including Linux). The micropython binary seems to be 17.56 times smaller than the python binary for Linux i386 (both binaries were statically linked against uClibc using https://github.com/pts/pts-clang-xstatic/blob/master/README.pts-xstatic.txt, and optionally compressed with UPX). The detailed file sizes are:

The script for recompiling MicroPython for Linux i386, statically linked is also open source.

Please note that neither StaticPython nor MicroPython open any external files (such as .so or .py or .zip) when starting up, all the Python interpreter and the Python standard library (and the libc as well) are statically linked in to the binary executable.

2016-01-06

How to extract comments from a JPEG file

This blog post explains how to extract comments from a JPEG file. Each JPEG file consists of segments. Each segment describes parts of the image data or metadata. The comments are in segments with marker COM (0xfe), there can be any number of them, anywhere (usually before the SOS segment) in the file.

Use the rdjpgcom command-line tool to extract comments. The tool is part of libjpeg, and on Ubuntu and Debian systems it can be installed with (don't type the leading $):

$ sudo apt-get install libjpeg-progs

Once installed, use it like this to print all comments in the JPEG file, with a terminating newline added to each:

$ rdjpgcom file.jpg

If the file doesn't have any comment, the output of rdjpgcom is empty. Here is how to add comments:

$ wrjpgcom -c 'COMfoo' com0.jpg >com1.jpg
$ wrjpgcom -c 'COMbar' com1.jpg >com2.jpg
After adding the comments, it will look like this:
$ rdjpgcom com2.jpg
COMfoo
COMbar

If you also want to see the unprintable characters (unsafe on a terminal), pass the -raw flag:

$ rdjpgcom -raw file.jpg

If you need a library, use the following C code, which is a minimalistic reimplementation of rdjpgcom -a:

/*
 * getjpegcom.c: Get comments from JPEG files.
 * by pts@fazekas.hu at Wed Jan  6 20:48:07 CET 2016
 *
 *   $ gcc -W -Wall -Wextra -Werror -s -O2 -o getjpegcom getjpegcom.c &&
 *   $ ./getjpegcom <file.jpg
 */

#include <stdio.h>
#if defined(MSDOS) || defined(WIN32)
#include <fcntl.h>  /* setmode. */
#endif

/* Get and print all comments in a JPEG file. Comments are written to of,
 * with a newline appended as terminator.
 *
 * Returns 0 on success, or a negative number on error.
 */
static int get_jpeg_comments(FILE *f, FILE *of) {
  int c, m;
  unsigned ss;
#if defined(MSDOS) || defined(WIN32)
  setmode(fileno(f), O_BINARY);
  setmode(fileno(of), O_BINARY);
#endif
  if (ferror(f)) return -1;
  /* A typical JPEG file has markers in these order:
   *   d8 e0_JFIF e1 e1 e2 db db fe fe c0 c4 c4 c4 c4 da d9.
   *   The first fe marker (COM, comment) was near offset 30000.
   * A typical JPEG file after filtering through jpegtran:
   *   d8 e0_JFIF fe fe db db c0 c4 c4 c4 c4 da d9.
   *   The first fe marker (COM, comment) was at offset 20.
   */
  if ((c = getc(f)) < 0) return -2;  /* Truncated (empty). */
  if (c != 0xff) return -3;
  if ((c = getc(f)) < 0) return -2;  /* Truncated. */
  if (c != 0xd8) return -3;  /* Not a JPEG file, SOI expected. */
  for (;;) {
    /* printf("@%ld\n", ftell(f)); */
    if ((c = getc(f)) < 0) return -2;  /* Truncated. */
    if (c != 0xff) return -3;  /* Not a JPEG file, marker expected. */
    if ((m = getc(f)) < 0) return -2;  /* Truncated. */
    while (m == 0xff) {  /* Padding. */
      if ((m = getc(f)) < 0) return -2;  /* Truncated. */
    }
    if (m == 0xd8) return -4;  /* SOI unexpected. */
    if (m == 0xd9) break;  /* EOI. */
    if (m == 0xda) break;  /* SOS. Would need special escaping to process. */
    /* printf("MARKER 0x%02x\n", m); */
    if ((c = getc(f)) < 0) return -2;  /* Truncated. */
    ss = (c + 0U) << 8;
    if ((c = getc(f)) < 0) return -2;  /* Truncated. */
    ss += c;
    if (ss < 2) return -5;  /* Segment too short. */
    for (ss -= 2; ss > 0; --ss) {
      if ((c = getc(f)) < 0) return -2;  /* Truncated. */
      if (m == 0xfe) putc(c, of);  /* Emit comment char. */
    }
    if (m == 0xfe) putc('\n', of);  /* End of comment. */
  }
  return 0;
}

int main(int argc, char **argv) {
  (void)argc; (void)argv;
  return -get_jpeg_comments(stdin, stdout);
}

Here is how to compile and run it:

$ gcc -W -Wall -Wextra -Werror -s -O2 -o getjpegcom getjpegcom.c
$ ./getjpegcom com2.jpg
COMfoo
COMbar

2015-12-05

Announcing flickrurlget: Flickr photo downloader from command-line in batch

This blog post is an announcement of flickrurlget, a command-line tool for Unix, written in Python 2.x that can be used to download photos from Flickr in batch. flickrurlget itself doesn't download photos, but it generates a list of raw photo URLs which can be downloaded with a download manager (even with `wget -i').

Download and install from source.

There are many photo download tools for Flickr (including command-line tools, GUI tools, Firefox extensions and Chrome extensions), none of which I tried having the feature set of flickrurlget:

  • Can get highest-resolution (if possible, original) image URLs in:
    • a Flickr photo
    • a photostream of a Flickr user (i.e. all photos of the user)
    • the favorite photos of a Flickr user
    • an album (photoset) of a Flickr user
    • a Flickr group
    • a gallery of a Flickr user
  • Batch operation: started from the command-line with a bunch of Flickr page URLs, and it discovers all the photos in there without any interaction.
  • Can get anybody's photos (not only your own).
  • Can get non-public and adult (non-safe) photos as well (after loggig in).
  • Doesn't need a GUI or a web browser for most of the operations.

There is the command-line tool flickr_download (source code and PyPi projet page), which is also written in python. Here is how flickrurlget is better than flickr_download:

  • Takes Flickr page URLs and figures out all parameters automatically.
  • Can download groups and galleries (in addition to user photostreams and photosets) as well.
  • Has user ID, group ID and photoset ID discovery: the user doesn't have to do manual ID lookups before running the tool.
  • Uses only standard Python modules, no other dependencies.
  • Better compatibility with old Python: works with 2.7, 2.6, 2.5 and 2.4; not only 2.7.

Flickr has a nice, full-featured REST API (works with both XML and JSON), its documentation is available here. flickrurlget uses this API. It can also optionally OAuth to log the user in.

2015-11-27

How to compute the intersection of two sorted lists (in Python)

This blog post explains how to compute the sorted intersection of two sorted lists, and it shows a fast Python implementation. The time complexity is O(min(n + m, n · log(m)), where n is the minumum of the list lengths, and m is the maximum of the list lengths.

The first idea is to take the first element of both lists, and, if they are different, discard the smaller one of the two. (The smaller one can't be in the intersection because it's smaller than all elements of the other list.) If the first elements were equal, then emit the value as part of the intersection, and discard both first elements. Continue this until one of the lists run out. If discarding is implemented by incrementing index variables, this method finishes in O(n + m) time, because each iteration discards at least one element.

We can improve on the first idea by noticing that it's possible to discard multiple elements in the beginning (i.e. when the two first elements are different): it's possible to discard all elements which are smaller than the larger one of the two first elements. Depending on the input, this can be a lot of elements, for example if all elements of the first list are smaller than all elements of the second list, then it will discard the entire first list in one step. In the general case, binary search can be used figure out how many elements to discard. However, binary search takes logarithmic time, so the total execution time is O(m · log(m)), which can be faster or slower than the O(n + m) of the previous solution. In fact, by more careful analysis of the number of runs (blocks which are discarded), it's possible to show that the execution time is just O(n · log(m)), but that can still be slower than the previous solution.

It's possible to combine the two solutions: estimate the execution time of the 2nd solution, and if the estimate is smaller than n + m, execute the 2nd solution, otherwise execute the first solution. Please note that the estimate also takes into account the constants (not only big-O). The resulting Python (≥2.4 and 3.x) code looks like:

def intersect_sorted(a1, a2):
  """Yields the intersection of sorted lists a1 and a2, without deduplication.

  Execution time is O(min(lo + hi, lo * log(hi))), where lo == min(len(a1),
  len(a2)) and hi == max(len(a1), len(a2)). It can be faster depending on
  the data.
  """
  import bisect, math
  s1, s2 = len(a1), len(a2)
  i1 = i2 = 0
  if s1 and s1 + s2 > min(s1, s2) * math.log(max(s1, s2)) * 1.4426950408889634:
    bi = bisect.bisect_left
    while i1 < s1 and i2 < s2:
      v1, v2 = a1[i1], a2[i2]
      if v1 == v2:
        yield v1
        i1 += 1
        i2 += 1
      elif v1 < v2:
        i1 = bi(a1, v2, i1)
      else:
        i2 = bi(a2, v1, i2)
  else:  # The linear solution is faster.
    while i1 < s1 and i2 < s2:
      v1, v2 = a1[i1], a2[i2]
      if v1 == v2:
        yield v1
        i1 += 1
        i2 += 1
      elif v1 < v2:
        i1 += 1
      else:
        i2 += 1

The numeric constant 1.4426950408889634 in the code above is 1/math.log(2).

The code with some tests and with support for merging multiple sequences is available on GitHub here.

Related questions on StackOverflow: this and this.

2015-03-20

How to use rsync over USB on Android with adb

This blog post explains how to copy files using rsync between the computer running Unix (typically Linux or Mac OS X) and the mobile device running Android, using an USB cable. It's not necessary to root the device. It's not necessary to install any app.

If you want to copy over wifi rather than USB, then please use the app rsync backup for Android (rsync4android) instead. The rest of this tutorial describes a method which needs the computer and the mobile device connected with a USB data cable.

Enable USB debugging on the device. If you don't know how to do it in Settings, then find a tutorial online.

Install adb (Android Debug Bridge) to the Unix system. For example, on Ubuntu: sudo apt-get install android-tools-adb

Connect the mobile device to the computer. Run: adb shell id on the computer. (All commands should be run on the computer unless asked otherwise.) It should display something like:

$ adb shell id
uid=2000(shell) gid=2000(shell) groups=1004(input),... context=u:r:shell:s0

If it doesn't work, you may have to enable Settings / Developer options / USB debugging on the device, then reconnect, then click OK on the dialog box in the device, then rerun adb shell id on the computer.

Please note that on Cyanogenmod rsync is installed by default and it is on the $PATH, so you can skip some of the steps below. (If you don't know what to skip, just do everything anyway.)

Download the rsync binary: wget -O rsync.bin http://github.com/pts/rsyncbin/raw/master/rsync.rsync4android

Copy the rsync binary to the device: adb push rsync.bin /data/local/tmp/rsync

Make the rsync binary on the device executable: adb shell chmod 755 /data/local/tmp/rsync

Make sure you have a backup copy of the binary in a more permanent directory: adb shell cp /data/local/tmp/rsync /sdcard/rsync.bin

Get the rsync version by running adb shell /data/local/tmp/rsync --version . Typical output:

$ adb shell /data/local/tmp/rsync --version
rsync  version 3.1.1  protocol version 31
Copyright (C) 1996-2014 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 32-bit timestamps, 64-bit long ints,
    no socketpairs, hardlinks, symlinks, no IPv6, batchfiles, inplace,
    append, no ACLs, no xattrs, no iconv, no symtimes, no prealloc

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.

Run rsync --version . If you get something like

rsync: command not found

, then rsync isn't installed to the computer. On Ubuntu you can install it with sudo apt-get install rsync

Create an rsyncd.conf config file and install it to the device by running: adb shell 'exec >/sdcard/rsyncd.conf && echo address = 127.0.0.1 && echo port = 1873 && echo "[root]" && echo path = / && echo use chroot = false && echo read only = false' . This must finish without displaying any (error) message.

Start the rsync daemon in the device by running: adb shell /data/local/tmp/rsync --daemon --no-detach --config=/sdcard/rsyncd.conf --log-file=/proc/self/fd/2 . It must start up with just a single message:

2015/03/19 15:58:14 [5814] rsyncd version 3.1.1 starting, listening on port 1873

Keep it running for the duration of the copies (below), and continue working in another terminal window. Or press Ctrl-C to exit right now, and restart it (so it will start running in the background on the device) like this: adb shell '/data/local/tmp/rsync --daemon --config=/sdcard/rsyncd.conf --log-file=/data/local/tmp/foo &'

Start port forwarding by running: adb forward tcp:6010 tcp:1873

Now you can start copying files with rsync (back and forth). An example command: rsync -av --progress --stats rsync://localhost:6010/root/sdcard/Ringtones .

You may find the --size-only flag useful if rsync is copying the same files over and over again.

You may want to copy from or to /storage/sdcard1 instead of /sdcard on the device.

Some newer storage devices have the exfat filesystem (older ones typically have fat or some emulation of it, and that's just fine). Writing to exfat drives rsync -av crazy: it reports steady progress with lots of Operation not permitted errors, but it actually doesn't create any files. This applies both rsync running on the Linux computer and rsync running on the device. A solution is to replace rsync -av with rsync -vrtlD , and restart the copy.

2015-03-03

How to avoid data copies with move semantics in C++11

This blog post explains how to avoid data copies in assignment from temporary values in C++. The move assignment operator (a feature introduced in C++11) will be defined for the class, and it will get called instead of the copy assignment operator, and the copy will be avoided.

Let's consider std::string, a type whose values are expensive to copy (assuming that the implementation copies the entire string data, not just a pointer to t buffer). Both the copy constructor and the copy assignment operator (operator=) copy the old data from the new data, like this for the copy assignment operator:

#include <string.h>
namespace std {
string &operator=(const string &other) {
  resize(other.size());
  memcpy(&(*this)[0], other.data(), other.size() + 1);
  return *this;
}
}

Let's assume that we have a function which returns a string: std::string GetUserName();. We can call this function and save the result to a variable: std::string user_name = GetUserName();. (It also works the same way with const in the beginning.) How many times does the value have to be copied until it lands in the variable user_name? Most modern compilers do the return value optimization to avoid all copies (so no copy constructor and no copy assignment operator is run). But if we already have the variable std::string user_name; and we want the assignment user_name = GetUserName(); avoid copies, then we need to define a move assignment operator (taking an rvalue reference (&&) argument instead of a const reference (const&) argument), and the assignment above will use the move assignment operator, which is faster than the copy assignment operator, because it can steal the resources from the source. An example implementation:

#include <string.h>
namespace std {
string &operator=(string &&other) {
  capacity_ = other.capacity_;
  size_ = other.size_;
  data_ = other.data_;  // Copies just the pointer.
  other.capacity_ = other.size_ = 0;
  other.data_ = nullptr;
  return *this;
}
}

There is also a corresponding move constructor which can be called instead of the copy constructor to avoid the copy. It works even if the return value optimization cannot be applied (e.g. when the function body has both return a; and return b;).

Let's see a more detailed example which has all these:

  • copy constructor (*C): C(const C&)
  • move constructor (&C): C(&&) (only for C++11)
  • copy assignment operator (=C): C &operator(const C&)
  • move assignment operator (#C): C &operator(C &&) (only for C++11)
#include <stdio.h>

class C {
 public:
  explicit C(int v) { printf("+C %d\n", v); }
  ~C() { puts("~C"); }
  C(const C&) { puts("*C"); }
  C &operator=(const C&) { puts("=C"); return *this; }
#if __GXX_EXPERIMENTAL_CXX0X__ || __cplusplus >= 201100
  C(C&&) { puts("&C"); }
  C &operator=(C &&) { puts("#C"); return *this; }
#endif
};

static inline C C10(int v) {
  return C(v * 10);
}

int main(int argc, char **argv) {
  (void)argc; (void)argv;
  C ca = C10(11);
  puts("---R1");
  C cb(22);
  puts("---R2");
  cb = C10(33);
  puts("---R3");
  return 0;
}

We can compile it for C++98 (older C++ standard than C++11) and run it:

$ g++ -W -Wall -Wextra -Werror -s -O2 -ansi -pedantic test_assign_with_move_semantics.cc && ./a.out
+C 110
---R1
+C 22
---R2
+C 330
=C
~C
---R3
~C
~C

And for C++11:

$ g++ -W -Wall -Wextra -Werror -s -O2 -std=c++0x -pedantic test_assign_with_move_semantics.cc && ./a.out
+C 110
---R1
+C 22
---R2
+C 330
#C
~C
---R3
~C
~C

The only difference is =C has changed to #C when C++11 features were enabled. That's because the body of the #if in the code gets compiled only for C++11 and above, and this body contains the move assignment operator. If there is a move assignment operator (e.g. in our C++11 version), then the line cb = C10(33); will use it, otherwise (e.g. in our C++98 version) that line will use the copy assignment operator.

Where do the actual data copies occur? In the copy assignment operator (#C) and in the copy constructor (*C, not called at all in the example). By defining a move assignment operator in C++11, we can prevent the copy assignment operator from getting called, thus we can avoid a copy when assigning from a temporary (rvalue).

Please note that the return value optimization avoids the copy in the C ca = C10(11); statement. This works even in C++98, without the move constructor.

In C++98, a copy can be avoided by using swap at the call site, for example if the caller replaces user_name = GetUserName(); with { std::string tmp = GetUserName(); user_name.swap(tmp); }, then the copy will be avoided: the definition of tmp takes advantage of the return value optimization, and swap swaps only the pointers and the sizes, not the actual data.

2015-02-12

How to generate all subsets of size k in Java

This blog post shows a Java class to generate all subsets of size k of the set {0, 1, ..., n - 1}, in lexicographic order. The code uses O(k) memory, and it doesn't store multiple subsets at the same time in memory. This is achieved by implementing the Iterable interface.

import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * Generate all subsets of {0, ..., n-1} of size k, in lexicographically
 * increasing order.
 */
public class Fixsub implements Iterator<int[]>, Iterable<int[]> {
  private int n;
  private int k;
  private int[] a;
  public Fixsub(int n, int k) {
    assert(k >= 0);
    assert(n >= k);
    this.n = n;
    this.k = k;
  }
  private int findIdxToIncrease() {
    int i;
    for (i = k - 1; i >= 0 && a[i] == n - k + i; --i) {}
    return i;
  }
  @Override public Iterator<int[]> iterator() {
    return this;
  }
  /**
   * Always returns the same array reference, the caller is responsible for
   * making a copy. The caller shouldn't modify the array elements.
   */
  @Override public int[] next() {
    if (a == null) {
      a = new int[k];
      for (int i = 0; i < k; ++i) {
        a[i] = i;
      }
    } else {
      int i = findIdxToIncrease();
      if (i < 0) throw new NoSuchElementException();
      for (++a[i++]; i < k; ++i) {
        a[i] = a[i - 1] + 1;
      }
    }
    return a;
  }
  @Override public boolean hasNext() {
    return a == null || findIdxToIncrease() >= 0;
  }
  @Override public void remove() {
    throw new UnsupportedOperationException();
  }

  public static void main(String[] args) {
    for (int[] p : new Fixsub(7, 4)) {
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < p.length; ++i) {
        if (i != 0) sb.append(',');
        sb.append(p[i]);
      }
      sb.append('\n');
      System.out.print(sb);
    }
  }
}

2014-10-26

How to download some labeled messages from Gmail to a Unix computer

This blog post explains how to download some Gmail messages (distinguished by a label) to a computer running in Unix. The download method shown works in headless mode, so it can be run from cron etc.

Preparation instructions

  • You will need a computer running Unix and which can run Fetchmail. Your mail will be downloaded to a file (in mbox format) onto that computer.
  • If Fetchmail is not installed to that computer, you need to install it (which needs root privileges) or get the admin install it for you.
  • You will have to add your Gmail e-mail address and password to a config file which will be stored on the Unix computer running fetchmail. If this is not secure enough for you, then please don't continue.

Gmail setup instructions

  • Create a Gmail account if you don't already have one.
  • Log in to Gmail in a web browser (can be on any computer).
  • In Settings (the gear button) / Forwarding and POP/IMAP, select Enable IMAP, and save the changes. Click on the Save button.
  • Create two labels, one of them (let's call it foo auto) will be used by Gmail to track which messages have been downloaded already. After a download, Gmail will automatically remove that label. The other label (let's call it foo manual) is for your reference only.
  • If you already have some e-mail in your inbox to be downloaded, apply both labels to them, manually (or by searching).
  • If needed, set up a filter (in Settings / Filters / Create new filter) which will apply both labels to incoming messages you want to be get downloaded.
  • Go to https://www.google.com/settings/security/lesssecureapps in your browser, and enable less secure apps (such as Fetchmail).

Unix computer setup instructions

  • Log in to the Unix computer you wan to download the messages to. Typically such login is done using SSH.
  • Install fetchmail, put it to $PATH (usually /usr/bin/fetchmail from package). Version 6.3.9 works, but probably older versions work too. On Debian and Ubuntu, it is as easy as running (without the $):
    $ sudo apt-get install fetchmail
  • Create a directory which will hold your downloaded mail. For example:
    $ mkdir downloaded.mail
    $ cd    downloaded.mail
  • Create a plain text config file with the contents below, and copy it to the Unix computer, to the downloaded.mail directory. Typically the copying is done using scp or rsync. Don't forget to change the USERNAME, PASSWORD and foo auto settings to reflect your Gmail account. (If it's inconvenient for you to edit files on the Unix computer, change it first locally, and then copy it again).
    defaults:
    mda "exec >>download.mbox && echo From MAILER-DAEMON Thu Mar 29 23:43:41 2007 && cat"
    poll imap.gmail.com
    proto IMAP
    user "USERNAME@gmail.com"
    pass "PASSWORD"
    # Gmail will auto-remove this label as soon as the message has been downloaded,
    # so it won't get downloaded again at the next run.
    folder "foo auto"
    ssl
    # Also download read messages.
    fetchall
  • Make sure the config file on the Unix computer has the filename download.fetchmailrc. Here is an example how to rename it:
    $ mv download.fetchmailrc.txt download.fetchmailrc
  • Revoke other users' access to the config file, protect your password from being stolen. Please note the star at the end:
    $ chmod 700 download.fetchmailrc*
  • Create the output mbox file and protect it:
    $ : >>download.mbox
    $ chmod 700 download.mbox
  • In the downloaded.mail directory, run:
    $ fetchmail -f download.fetchmailrc

    This will download and append all your Gmail messages with the label foo auto to the file download.mbox to the downloaded.mail directory on the Unix computer, and remove the label foo auto, so when you run the command again, messages already downloaded won't be downloaded again. (Gmail labels are global, so you have to define additional labels if you want to download mail to several computers.)

  • If you get this error:
    .../.fetchmail.pid: Permission denied
    Then add --pidfile fetchmail.pid (without the quotes) to your fetchmail command-line.
  • If you get this error:
    fetchmail: Authorization failure on ...@gmail.com@...
    fetchmail: Query status=3 (AUTHFAIL)

    Then visit http://www.google.com/accounts/DisplayUnlockCaptcha from any web browser, unlock it, and run the fetchmail command again.

  • If needed, set up a cron job which will download automatically for you. Typically you can download once per minute, once per hour or once per day using cron jobs.

Incremental download instructions

  • Log in to the Unix computer you wan to download the messages to. Typically such login is done using SSH.
  • Change to the downloaded.mail directory:
    $ cd downloaded.mail
  • In the downloaded.mail directory, run:
    $ fetchmail -f download.fetchmailrc

    Messages already downloaded won't be downloaded again, because they don't have to foo auto label anymore. (Gmail labels are global, so you have to define additional labels if you want to download mail to several computers.)

2014-10-21

On the speed of memset

This blog post presents some of the speed measurements I've done with alternative implementations of memset.

I was filling a 1 GB memory area 20 times (equivalent to memset(a, 0, 1 << 30) each) with various different implementations, and measuring the speed. I've compiled the program for i386 (gcc -m32) and amd64 (gcc -m64), I ran it on desktop PC running 64-bit Linux 3.13.0 on a Xeon CPU (Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz) with 32 GB of RAM. I was compiling the code with GCC 4.6.3, with optimization level gcc -O2, because gcc -O3 optimized the 1-byte-at-a-time loop to a 4-bytes-at-a-time loop.

It turned out that there was no measurable difference in the i386 and amd64 version of the programs. Here are relative speeds (higher numbers are proportionally faster) of user times:

  • 1.480: memset, doing rep stosd, 4 bytes at a time
  • 1.000: 16 writes of 4 bytes each in loop the body (like Duff's device)
  • 1.000: 1 write of 8 bytes in the loop body
  • 1.000: 1 write of 4 bytes in the loop body
  • 0.675: 1 write of 2 bytes in the loop body
  • 0.329: 1 write of 1 byte in the loop body (char *cp = a, *cpend = a + sizeof(a) / sizeof(a[0]); for (; cp != cpend; ++cp) *cp = 0;)

I can interpret the numbers except for memset the following way: the cache doesn't help at this size; CPU does correct branch prediction (or taking the branch is fast enough), or taking a branch is faster than writing to memory; the data bus between the CPU and the memory can take 4 bytes at a time.

But why is the assembly instruction rep stosd that much faster than any of the loops? What's the magic behind it? It looks like my CPU had an optimized rep stosd and rep stosb built in, called ERMSB. More details in the PDF available from here (search for memset within the downloaded PDF).

2014-09-25

Change 1 character puzzle #2

Add or replace at most 1 character in the longest line of the following program so that it will print 20 dots. There are at least 10 solutions.

#include <stdio.h>
int main() {
  int i, n = 20; for (i = -n; n > i; i++) putchar('.');
  return 0;
}

SPOILER ALERT! Here are 10 solutions:

int i, n = 10; for (i = -n; n > i; i++) putchar('.');
int i, n = 20; for (i = -0; n > i; i++) putchar('.');
int i, n = 20; for (i = -!n; n > i; i++) putchar('.');
int i, n = 20; for (i = n-n; n > i; i++) putchar('.');
int i, n = 20; for (i = -n; !n > i; i++) putchar('.');
int i, n = 20; for (i = -n; 0 > i; i++) putchar('.');
int i, n = 20; for (i = -n; n * i; i++) putchar('.');
int i, n = 20; for (i = -n; n & i; i++) putchar('.');
int i, n = 20; for (i = -n; n = i; i++) putchar('.');
int i, n = 20; for (i = -n; n , i; i++) putchar('.');

Change 1 character puzzle #1

Add or replace at most 1 character in the longest line of the following program so that it will print 20 dots. There are at least 3 solutions.

#include <stdio.h>
int main() {
  int i, n = 20; for (i = 0; i < n; i--) putchar('.');
  return 0;
}

SPOILER ALERT! Here are 3 solutions:

int i, n = 20; for (i = 0; -i < n; i--) putchar('.');
int i, n = 20; for (i = 0; i < n; n--) putchar('.');
int i, n = 20; for (i = 0; i + n; i--) putchar('.');

2014-09-08

How to keep to your commitments without using up too much willpower

This blog post explains how hyperbolic discounting causes you to change your mind, and how to trick it so that you can keep to your commitments without using up too much willpower.

A student who doesn't learn as much as he wants to

A student has an exam on Friday. He has 4 days (Monday, Tuesday, Wednesday and Thursday) to learn. The more he learns, the more points he will get for the exam. On each learning day, he may choose to skip learning and go to a party instead. It happens that the previous week he decides that he will learn for 3 days, but during the week it turns out that he learns only once, and goes to parties on 3 days. How can this happen?

Explanation with hyperbolic discounting

This can be explained by assuming that the mind does hyperbolic discounting, i.e. it computes today's utility of a future event by dividing the utility of the event divided by the number of days of delay. See the more specific example below:

Hyperbolic discounting for time for the utility of u today:

  • same thing tomorrow: u/2
  • same thing in 2 days: u/3
  • same thing in 3 days: u/4
  • same thing in 4 days: u/5
  • same thing in 5 days: u/6

Utility function for today:

  • take the exam, get x points (where x is the number of days previously spent learning for the exam, x in 0..4): 60*x
  • party: 29
  • have a rest: 0
  • learn: 0
Previous Sunday:
  • learn on Monday–Friday: 60*4/6 = 40
  • party on Monday–Friday: 29/2+29/3+29/4+29/5 = 37.216...
  • party on Monday–Tuesday, learn on Wednesday and Thursday: 29/2+29/3+60*2/6 = 44.166...
  • learn on Monday–Tuesday, party on Wednesday and Thursday: 60*2/6+29/4+29/5 = 33.05
  • party on Monday, learn on Tuesday–Thursday: 29/2+60*3/6 = 44.5
  • party on Monday–Wednesday, learn on Thursday: 29/2+29/3+29/4+60/6 = 41.416...
  • ... (there are a few other cases)
  • what will happen: The student decides to party on Monday, and to learn on Tuesday--Thursday.

Monday:

  • learn on Monday–Thursday: 60*4/5 = 48
  • party on Monday, learn on Tuesday–Thursday: 29+60*3/5 = 65
  • party on Monday–Tuesday, learn on Wednesday–Thursday: 29+29/2+60*2/5 = 67.5
  • party on Monday–Wednesday, learn on Thursday: 29+29/2+29/3+60/5 = 65.166...
  • party on Monday–Thursday: 29+29/3+29/3+29/4 = 55.583...
  • ... (there are a few other cases)
  • what will happen: The student parties on Monday, and decides to party on Tuesday, and to learn on Wednesday--Thursday.

Tuesday:

  • learn on Tuesday–Thursday: 60*3/4 = 45
  • party on Tuesday, learn on Wednesday–Thursday: 29+60*2/4 = 59
  • party on Tuesday–Wednesday, learn on Thursday: 29+29/2+60/4 = 58.5
  • party on Tuesday–Thursday: 29+29/3+29/3 = 48.333...
  • ... (there are a few other cases)
  • what will happen: The student parties on Tuesday, and decides to learn on Wednesday--Thursday.

Wednesday:

  • learn on Wednesday–Thursday: 60*2/3 = 40
  • party on Wednesday, learn on Thursday: 29+60/3 = 49
  • party Wednesday–Thursday: 29+29/2 = 43.5
  • ... (there is one more case)
  • what will happen: The student parties on Wednesday, and decides to learn on Thursday.

Thursday:

  • learn on Thursday: 60/2 = 30
  • party on Thursday: 29
  • what will happen: The student learns on Thursday.

Friday:

  • take the exam on Friday: 60
  • don't take the exam on Friday: 0
  • party on Friday: 29
  • what will happen: The student takes the exam, scoring 1 point.

So hyperbolic discounting causes the student to change his mind several times during the week: he will do something else on Tuesday and Wednesday than what he decides on Sunday. On Sunday he was planning to learn later next week, and get 3 point on the exam. But he will end up learning only on one day, and getting only 1 point on the exam.

How to keep to your commitments you've made to yourself?

Instead of hyperbolic discounting, let's calculate with exponential discounting.

Exponential discounting for time for the utility of u today:

  • same thing tomorrow: u/2
  • same thing in 2 days: u/4
  • same thing in 3 days: u/8
  • same thing in 4 days: u/16
  • same thing in 5 days: u/32

It can be proven that with exponential discounting you would never change your mind: the relative utility of outcomes remains the same as time passes, there won't be any inconsistencies.

Unfortunately, some experiments show that hyperbolic discounting is hardwired into your mind, and you have no conscious power over it, you can't replace it with exponential discounting at will.

Excercising willpower is easier said than done, particularly that each person has a limited amount of willpower, and if it gets depleted, one has to wait for it to get refilled. So depending on the challenges you had previously on that day, you may have no willpower left to stay away from the party.

Another option you have is limiting and committing yourself to the same choice for a prolonged period of time. For example, if the student commits himself to do the same activity on Monday, Tuesday and Wednesday, then he will choose learning on all the 4 days, and he will actually do it.

2014-08-16

On generating integer-sided triangles with an integer area/perimeter ratio

This blog post contains come comments about finding all triangles whose sizes are integers and the area/perimeter ratio is the integer r. This is related to Project Euler problem 283.

The blog post contains a nice analysis and some pseudocode for generating all possible (a, b, c) size triplets for the given ratio r. Unfortunately both the analysis and the pseudocode contains some mistakes.

Here is the corrected version:

def find_triangles_correct(r):
  for u in divisors of 2r:
    for v in 1 to floor(sqrt(3) * u):
      if gcd(u, v) == 1:
        lhs = 4*r*r*(u*u+v*v)
        for d1 in positive divisors of lhs:
          if d1 <= lhs / d1:
            d2 = lhs / d1
            if not (u < v and d1*u < (2r(v*v-u*u))):
              if ((d1+2ru) mod v) = 0 and ((d2+2ru) mod v) = 0:
                a = (d1 + 2ru) / v + 2rv / u
                b = (d2 + 2ru) / v + 2rv / u
                c = (d1 + d2 + 4ru) / v
                yield a, b, c
Some mistakes and other inaccuracies in the original:
  • The variable r were silently renamed to m.
  • The formulas for a, b and c were not included.
  • The analysis incorrectly states that v must be smaller than floor(sqrt(3)*u)). The correct statement is: v must be smaller than sqrt(3)*u. Before the correction some solutions such as (6, 8, 10) for r=1 were not generated.
  • The condition d1 <= lhs / d1 had a >= instead.

Please note that the algorithm yields each triangle once, in an unspecified order of a, b and c within the triangle.

2014-08-12

How the lifetime of temporaries is extended because of references in C++

This blog post summarizes what I learned about C++ today about how the lifetime of temporaries is extended because of references.

Normally a temporary is destroyed at the end of the statement, for example temporary string returned by GetName() lives until the end of the line defining name. (There is also the return value optimization which may eliminate the temporary.)

#include <stdio.h>

#include <string>
using std::string;
string GetName() {
  return string("foo") + "bar";
}
int main() {
  string name = GetName();
  printf("name=(%s)\n", name.c_str());
  return 0;
}

However, it's possible to extend the life of the temporary by assigning it to a reference-typed local variable. In that case, the temporary will live until the corresponding reference variable goes out of scope. For example, this will also work and it's equivalent to above:

#include <stdio.h>
int main() {
  const string &name = GetName();
  printf("name=(%s)\n", name.c_str());
  return 0;
}

Please note that there is no lifetime extension if the function returns a reference, and also in some other cases related to function boundaries. See more details about reference initialization.

2014-07-29

How to discard committer info from git history

This blog post explains how to overwrite the git committer name, git committer e-mail and git committer date in previous commits in a git repository. This is useful e.g. after a series of git commit -C ... calls, which copy the author name, e-mail and date from the specified commit, but use the committer name, e-mail and date from the environment of the command.

Please note that rewriting git history is potentially dangerous because it can lead to data loss and synchronization issues with others who pull from the repository. Read more about it in Git Tools – Rewriting History.

First make sure you don't have any uncommitted changes: git status -s shouldn't print anything.

Then checkout the relevant branch, and run this command on a Unix systems (or within Git Bash on Windows), without the leading $:

$ git filter-branch -f --commit-filter '
    GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"
    GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
    GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
    git commit-tree "$@"' HEAD

Repeat running this command in each local branch you care about. For each branch, it will rewrite the history of the entire branch from its roots.

If you are sure that you didn't have a data loss, then run:

$ rm -rf .git/refs/original

2014-07-08

Programming tips for students

This blog post gives you advice to make the best use of your time for learning how to program and enjoying programming in your pre-college years.

General, technology-independent recommendations

The best situation is if you have an experienced programming mentor, or some friends or schoolmates who are a bit ahead of you, and you have the chance to observe what these people do, and can ask questions. If that's the case, you are probably better off observing and asking these people than to follow the advice below.

If you are an absolute beginner, read Message to the students first. It contains many links where you can start learning how to program.

If you or your school can afford buying books, see the list of recommended books. The home or school library may have outdated and moderately useful course material, it's worth getting and reading the best books instead settling with what's already available.

Learn English to the extent that you can read and understand everything related to programming, and you can ask technical questions clearly in written form. If you have several languages to start learning, always pick English as the first language. It's much more useful to improve your English skills to fluent (so you can give a presentation to an audience of 1000 people in English just as easily as in your mother tongue) than to improve your skills in any other language from basic to medium.

Learn typing quickly, and without looking at the keyboard. Z-Type is a nice way to practice.

Get a large screen (at least full HD resolution is strongly recommended) and a comfortable keyboard and mouse. Try many different keyboards and mice -- most of them are cheap. (Cheap, wired ones can be much better than expensive ones.) Pick one keyboard layout (US English or your local one) and stick to it.

Learn how to ask questions so you get the answer you need quickly. Start learning from How To Ask Questions the Smart Way.

There are many great websites which teach you programming for free, even if you have no prior experience. See a list of them in Message to the students.

To become a skilled and experienced programmer, practice, i.e. write lots of code. Start that early. A minute spent on writing code is better spent than an hour of reading a book or a web page about programming.

If you get stuck, ask a question on StackOverflow. It's free. You will get several answers in 5 minutes from experienced programmers, you can pick the best answer, or ask for clarification. To modify specific behavior of specific web pages, write JavaScript which modifies the DOM, and use Greasemonkey with Firefox to run it each time to visit the page. Google Chrome can also run Greasemonkey scripts.

If you need something, write a small program for (parts of) it. That's easy to start on Linux and shell scripts and Python.

Start using a version control system (e.g. Git, Subversion) as soon as you have written a program of at least 100 lines of code. (But definitely start after 1000 lines.) The rule of thumb is: if you (or others) worked more than 15 minutes on it in total, then add the code to version control system. Start early with using a version control system, don't procrastinate. Put every program file you write to a repository managed by a version control system. Commit often, about once per hour, but definitely at least once per day. Make backups regularly (at least once per week) of everything you write. The simplest way to do it is to copy everything to an external hard drive, and disconnect the hard drive after the copy has finished. Using online backup services and version control systems (e.g. Git, Subversion) also serves as an additional backup for some of the code you write — but also make a traditional backup.

Find good programmers around you (parents, relatives, schoolmates, friends, teachers). Ask lots of technical questions (any question you think they may be able to answer) in person. With their permission, observe them while they are working with the computer. Afterwards, ask why they did something the way they did it. Ask their opinion about technologies they are working with. Ask their favorite programming language etc., and ask why. Figure out their favorite technical topic, and ask them about it (it may take several hours to listen to the answer, but these are one of the best spent hours). If you trust them and you respect their skills, ask their advice. Ask enough questions before you decide if they are smart or not.

Participate in programming contests. You can practice for Google Code Jam online. For less competitive, and easier to solve problems, see Project Euler. topcoder also has great problems and a great community.

Arrange your windows on screen so that they don't overlap at all. Cycling through overlapping windows contribute a lot to lost productivity. Use virtual desktops. On Linux, use a tiling window manager if necessary. Edit your program in one window, and see the results in another window (e.g. web browser, command-line, PDF viewer). Make both windows visible at the same time (without overlap). Once set up, don't cycle through them, don't resize them and don't move them: just see them both at the same time, and work in either one. If necessary, create more than 2 windows (maybe 2 * 2 or 3 * 2), without overlap.

Recommended sites, products and technologies

Install Linux as (try to) use it as your main operating system. Linux is very much customizable and scriptable (all the command-line tools described in the book The Unix Programming Environment are available). Be careful, installing another operating system (such as Linux) may lead to data loss and may render your computer temporarily unusable (but an expert can fix that). So make a full backup first, or install Linux to an otherwise unused computer. If Linux is not an option, try to use a Mac instead, and install free software from MacPorts. The Mac OS X is a variant of Unix, with the tools described in the book The Unix Programming Environment available.

To get quick and high quality answers to your technical questions: StackOverflow and other StackExchange sites.

To host the source code of your open-source projects (including a version control system, a wiki, an issue tracker and a code review tool): github and Google Code project hosting. There are many other providers, so compare the features and the code ownership, publicity and copyright issues before choosing.

Free C++ IDE: Code::Blocks.

Free code editor for HTML and scripting languages (Python, PHP, Ruby, JavaScript, Perl, TCL, XML, HTML, CSS): Komodo Edit.

Use GNU Make or some other incremental build tool to make sure that the relevant parts of your program gets recompiled.

Recommended books for programming students

This blog post lists the best books I can recommend to students who want to learn how to program. Please note that programming is more about doing than reading, see Message to the students for a more general picture.

Basics not tied to a specific programming language:

  • Absolute beginners should follow the Message to the students guide first, and finish some online courses about programming.
  • Rivest et al.: Introduction to Algorithms, 3rd edition.
  • Henry S. Warren, Jr: Hacker's Delight, 2nd edition. Chapter 2 (one of the best chapters) is available for download for free.
  • Donald E. Knuth: The Art of Computer Programming, Volume 1, Volume 2, Volume 3.
  • Brian W. Kernighan and Rob Pike: The Unix Programming Environment, 1984.
  • Steve McConnell: Code Complete, 2nd edition.

Basics for some important programming languages and technologies::

Gems specific to programming languages and technologies beyond the basics:

  • C++: Scott Meyers: Effective C++, 3rd edition.
  • Java: Joshua Bloch: Effective Java, 2nd edition.
  • Java: Joshua Bloch and Neal Gafter: Java Puzzlers, 2005.

2014-06-20

Message to the students

Dear Student,

This is your life. Make it happy, productive and full for yourself! The smarter and more skilled you are, the more choices and flexibility you have (what to work on, who to work with, where to work and live, when to switch jobs). Follow your curiosity! If you are particularly interested in a topic, use as much as possible of your free time to study, to practice and to do it. 10000 hours is enough to become a world-class expert or champion. Start now! Find teachers, researchers (on the university), craftsmen, and other people doing or learning the same thing (your friends, or any related group on meetup.com), and cooperate with them. Find the best books about the topic online, and read them. You have to do the research to find the people and books which can help you. It's your life and for your happiness, so do it now, and continue doing it! Listen to recommendations and suggestions, use them to figure out what you like, and then in most of your free time focus your attention on topics which fascinate you.

If you are too busy reading through the rest, then try these web resources for learning programming:

If you are unsure if programming or computer science is for you, ask a programmer, or read on for figuring out! If you are interested, but if you are not sure if your friends would think it's cool, then ask yourself: whose life is it, and what would matter for you 10 years from now. If you are sure that programming is not your cup of tea, then take a look at Coursera, where you can learn more than 600 subjects online. (All the websites recommended here are free to use, all you need is a desktop or laptop computer with internet access.) It's like a class (it takes several weeks, you have to learn for a few hours each week). Teachers from a few of the best universities have uploaded the material in Coursera, and they will also check your homework. You can browse through the list of courses, or you can search for your favorite topics. (There are even some courses about programming.) Unfortunately, not all courses can be started immediately, for some of them you have to wait a few months (just add them to your watchlist). If there is too much choice for you and you don't know which one to pick, then start with A Brief History of Humankind (no homework, you just have to watch the videos, the German version of the book is available). You may even want to buy the book by the same author.

To start with programming, play Lightbot first (it takes a few hours). There is also an iPhone and and Android app. Much of programming consists of sitting in front of a computer and typing the program. Learn touch typing (i.e. typing quickly without looking at the keyboard). Z-Type is a nice game in which you can practice typing English. Decide on a keyboard layout, and learn where the special symbols are (e.g. # and {). If typing is still slow and frustrating for you, don't give up: after a few months of practice, your fingers learn where the keys are. In the meantime, you can use Scratch to write some spectacular programs (even simple games) using mostly the mouse, with very little typing. Scratch can do 2D (2 dimensional) graphics only. Blocky is an alternative of Scratch. Try it if Scratch doesn't work well on your computer or it looks too complicated. If you are interested in 3D, use AgentCubes. Unfortunately it's not as user friendly as Scratch, it's more clunky, but it's still enjoyable.

Programming is a creative activity, mostly at the computer. The immediate output is the program code you type, and the final output is the program which runs, and makes people's life easier or more fun. (It's OK if it's fun only for you in the beginning.) The other aspect of it is computer science, which is done mostly at the whiteboard and reading books, mostly thinking about how a computation can be done can be done as quickly as possible for large inputs. (A typical computation: you have to visit 30 cities in a week (and do some activity in each of them), you know the distances between each 2 cities, and you want to make your trip as short as possible. In real-life problems there can be thousands of cities.) This is very similar to the non-geometry parts of math, so if you enjoy solving math puzzles, than probably you would also enjoy learning and doing computer science. You can start with the Principles of Computing lecture on Coursera, and then try the computer science section of Khan Academy. The best classic books in the topic are The Art of Computer Programming (start with the first 3 volumes) and Introduction to Algorithms. If you open these books, you'll see lots of math formulas and lots of graphs (with nodes, edges and labels) in them. That's normal, these are the everyday tools of a computer scientist.

So, back to programming. If there is nobody around to teach you, or you'd like to move at your own pace, start on Khan Academy, by clicking on the Learn Programming link. You don't need a teacher sitting or standing next to you, just follow the instructions on the website. Like there are many languages to tell other people what you want, there are many programming languages to instruct the computer. Khan Academy teaches you JavaScript, one of the most popular languages today. Another similar one is Crunchzilla, which also teaches you JavaScript in a visual way, step-by-step. Eventually, over the years, you should learn many programming languages, such as Python, Java, C, C++, Ruby. (This is the list  of best programming languages to learn changes quickly, maybe in 5 years from now it would be different.). If you want to build web pages, you should learn HTML and CSS, which are languages to describe text formatting and visual layouts of web pages. In addition to JavaScript, you can learn Python, Ruby, HTML, CSS and some other languages at Codecademy. You don't have to register or log in: just scroll to the bottom, select the language you want to learn (e.g. HTML/CSS), and then just follow the instructions, it will teach you step-by-step. There are also programming courses in Coursera, for example one of them teaches you to build your own game in Python. Unfortunately you have to wait several months for that to start. In the meantime, you can also learn game programming in JavaScript at CodeAvengers. If you are interested in building a mobile app for Android phones (no iPhone), then do a few of the tutorials at App Inventor.

If you like nice, eye-candy design with music, then Code Combat will teach you JavaScript in a fantasy (medieval) atmosphere: you have to rescue kidnapped town members from prisons guided by trolls etc. A similar website is Ruby Warrior, you can practice the Ruby language there, but it's more enjoyable if you have some prior programming experience, so don't start there. Both Code Combat and Ruby Warrior are quite resource-hungry, so you need a very modern and fast computer to play smoothly, and you have to close most other programs.

HTML (for text and structure), CSS (for visual layout) and JavaScript (for interactive elements such as animation and reaction to mouse clicks) are the 3 most important languages today to build websites: the code written in those languages are executed (interpreted) by the web browsers, and that code will determine how the web page will look and sound like and how it will behave. (There are many other programming languages for the server side, i.e. to make the website remember your data when you visit it from a different computer or browser.) As mentioned above, learn these 3 languages from Codecademy. You can also practice your CSS skills in a playful way on Flukeout (but you have to learn it first somewhere else). If you want to build a website, save it and show it off to your friends, Thimble is a nice and easy-to-use tool for that. It supports HTML, CSS and JavaScript. You have to register and log in first in order to be able to save your creation.

When real programmers work and write code, they usually work in a text-editor or an IDE (integrated development environment), and not in a web browser. To make it easy to start for you, most of the learning material mentioned above needs only a web browser with internet access. After you have finished one or two programming classes above, you should give it a try with the real text editor. There are step-by-step instructions for that in Python in the Programming with Python (do this first) and Data Processing with Python courses. Just follow the instructions, it tells you what to install and how to use it.

Once you get good at programming (i.e. you've finished many programming courses, and when you start the next one, you find it too easy), and enjoy it a lot, you should try your skills (and possibly win some prizes) in one of the programming contests. For example, there is Google Code Jam, you can start practicing any time with any Round 1 (or Round 1A). It works best if 3 or more of you work together, trying to solve one of the problems together, and then you go on separately for the 2nd problem, and at the end if anyone has a correct solution, they show the others how they did it. You can also ask your teacher if there are other programmings contest closeby where you live, and if there are some preparation classes you can attend. As soon as you know the basics of programming, the most you can learn from is solving contest problems and understanding other people's analyses and solutions. Project Euler is similarly useful, but it contains more math-oriented and theoretical problems. After solving the first few easy problems, you'll need your best math, programming and computer science skills to solve the rest.

Recommended reading before and during university for computer science students: What every computer science major should know.

Good luck and have fun!