Archive for the ‘Tech’ Category
Integrate Astrid with Thunderbird
Astrid is just yet another todo list in the cloud. It has some advanced features like the small built in social network that allows you to manage todo lists collaboratively. Furthermore, there is a nice little Android app for it. My use cases, however, are usually much simpler: I just want to remember stuff.
One of the things that I regularly put on a todolist is replying to emails. This happened so frequently that I was wondering whether there was any Thunderbird extension that supports the creation of new tasks with Astrid. A typical, basic scenario might for instance be the following:
 Thunderbird receives a new email with the subject “XYZ”
 I highlight the new message and click on a button in the toolbar
 A new task with the title “Reply: XYZ” gets added to an Astrid list called “mail” with due date set for today
An advanced version could include the following functionality:
 When I click said button, the email is assigned a “TODO” tag in Thunderbird
 When I mark the task as finished in Astrid, the tag is removed
 When I remove the tag in Thunderbird, the task is marked as finished in Astrid
Using saved searches, this could create the illusion of a virtual todo list within Thunderbird.
Unfortunately I could not find any extension that would do the job – so I decided to come up with something myself. As I am not a Thunderbird hacker, my first approach is a quick and dirty solution to implement the basic usecase described above. I don’t have any experiences in developing Thunderbird extensions, therefore I am using Custom Buttons instead. This extensions allows me to create a new button in my Thunderbird toolbar and to hack together some JavaScript that gets called whenever the button is clicked.
Astrid has an API. They even provide a small JavaScript library. However, I did not want to go through the hassle of requesting an API key (not even knowing whether this will get me anywhere) and I am afraid that my JavaScriptfoo is somewhat limited. Thus, I decided to work around the problem using the email interface that Astrid provides – an idea that I stole from this blogpost.
Without further ado, this is what I came up with:
function getMessageBody(aMessageHeader) { // Credit goes to http://forums.mozillazine.org/viewtopic.php?f=19&t=1699075 var messenger = Components.classes["@mozilla.org/messenger;1"] .createInstance(Components.interfaces.nsIMessenger); var listener = Components.classes["@mozilla.org/network/syncstreamlistener;1"] .createInstance(Components.interfaces.nsISyncStreamListener); var uri = aMessageHeader.folder.getUriForMsg(aMessageHeader); messenger.messageServiceFromURI(uri) .streamMessage(uri, listener, null, null, false, ""); var folder = aMessageHeader.folder; return folder.getMsgTextFromStream(listener.inputStream, aMessageHeader.Charset, 65536, 32768, false, true, { }); } var mAccountMgrSvc = Components.classes["@mozilla.org/messenger/accountmanager;1"] .getService(Components.interfaces.nsIMsgAccountManager); var account = mAccountMgrSvc.getAccount("account3"); var identity = account.defaultIdentity; var composeFields = Components.classes["@mozilla.org/messengercompose/composefields;1"] .createInstance(Components.interfaces.nsIMsgCompFields); composeFields.characterSet = "UTF8"; composeFields.to = "new@astrid.com"; composeFields.from = identity.email; composeFields.replyTo = identity.replyTo; composeFields.subject = "Reply: " + gFolderDisplay.selectedMessage.mime2DecodedSubject + " #mail (today)"; composeFields.body = getMessageBody(gFolderDisplay.selectedMessage) + "\r\n"; var MsgComposeParams = Components.classes["@mozilla.org/messengercompose/composeparams;1"] .createInstance(Components.interfaces.nsIMsgComposeParams); MsgComposeParams.composeFields = composeFields; MsgComposeParams.identity = identity; MsgComposeParams.format = Components.interfaces.nsIMsgCompFormat.PlainText MsgComposeParams.type = Components.interfaces.nsIMsgCompType.New; var msgSend = Components.classes["@mozilla.org/messengercompose/send;1"] .createInstance(Components.interfaces.nsIMsgSend); var MsgCompose = Components.classes["@mozilla.org/messengercompose/compose;1"] .createInstance(Components.interfaces.nsIMsgCompose); MsgCompose.initialize(MsgComposeParams); MsgCompose.SendMsg(msgSend.nsMsgDeliverNow, identity, account, null, null);
Before I proceed, let me add that I am not a JavaScript hacker and the above is thus probably terrible code. As I mentioned before, I just wanted to get something working (suggestions & improvements are welcome).
To get it to work, make sure that “account3″ is replaced by the (Thunderbird) key of the account with which you are registered at Astrid. As you can see, I add ” #mail (today)” to the subject line. The first part tells astrid to save the new task into a list called “mail”. The second part sets the due date of the new task to today.
This immediately reveals a potential problem of this code: If the subject of the original mail contains any #’s or other symbols recognised by Astrid, then this will lead to undesired effects. A big todo item is thus to add some escaping functionality. In the long run it is probably better to migrate to the Astrid API instead – in particular if I want to implement any of the advanced behaviour described above.
Training a POMDP (with Python)
Working on my Bachelor Thesis[5], I noticed that several authors have trained a Partially Observable Markov Decision Process (POMDP) using a variant of the BaumWelch Procedure (for example McCallum [4][3]) but no one actually gave a detailed description how to do it. In this post I will highlight some of the difficulties and present a possible solution based on an idea proposed by Devijver [2]. In the first part I will briefly present the BaumWelch Algorithm and define POMDPs in general. Subsequently, a version of the alphabeta algorithm tailored to POMDPs will be presented from which we can derive the update rule. Finally I will present a sample implementation in Python. If you are not interested in the theory, just skip over to the last part.
The BaumWelch Procedure
In its original formulation, the BaumWelch procedure[1][6] is a special case of the EMAlgorithm that can be used to optimise the parameters of a Hidden Markov Model (HMM) against a data set. The data consists of a sequence of observed inputs to the decision process and a corresponding sequence of outputs. The goal is to maximise the likelihood of the observed sequences under the POMDP. The procedure is based on Baum’s forwardbackward algorithm (a.k.a. alphabeta algorithm) for HMM parameter estimation.
The application that I had in mind required two modifications:
 The original forwardbackward algorithm suffers from numerical instabilities. Devijver proposed a equivalent reformulation which works around these instabilities [2].
 Both the BaumWelch procedure and Devijver’s version of the forwardbackward algorithm are designed for HMMs, not for POMDPs.
POMDPs
In this section I will introduce some notation.
Let the POMDP be defined by .
 is the state transition matrix. It defines the probability of transferring to state given that the current state is .
 is the output matrix. It defines the probability of observing output given that the current state is .
 is the number of states (or, equivalently: the dimensionality of the latent variable).
 is the number of outputs (the dimensionality of the observed variable)
 is the initial state distribution.
This may not be the standard way to define POMDPs. However, this vectorized notation has several advantages:
 It allows us to express state transitions very neatly. If is a vector expressing our belief in which state we are in time step , then (without observing any output) should express our beliefs in time step
 We can use it in a similar way to deal with output probabilities.
 It is (in my opinion) much nicer to implement an actual algorithm using the toolbox of vector and matrix operations.
Applying Devijver’s Technique to POMDPs
This section will demonstrate, how Devijver’s forwardbackward algorithm can be restated for the case of POMDPs
The aim of the POMDP forwardbackward procedure is to find an estimator for . Using Bayes’ Rule
The problem reduces thus to finding and which shall be called the forward estimate and the backward estimate respectively.
Repeated application of Bayes’ Rule and the definition of the POMDP leads to the following recursive formulation of :
This yields a recursive policy for calculating . The missing piece, , can be calculated using the preceding recursion step for :
The common term of the recursion and the recursion can be extracted to make the process computationally more efficient:
The result differs from the original formulation [2] merely by the fact that the appropriate transition matrix is chosen in every recursion step. It is still necessary to calculate , which can be reduced to a similar recursion:
The base cases of these recursions follow directly from their probabilistic interpretation:
Using the definitions of and it is now possible to derive an unbiased estimator for . It’s definition bears a striking resemblance to the estimator derived by Devijver [2]. Analogue to the steps Baum and Welch took to derive their update rule, the need for two more probabilities arises: and . Luckily, , and again can be used to compute these probabilities.
It turns out, that the values for calculated above can also be used to calculate the likelihood of the observed data under the current model parameters.
In practice the loglikelihood will be of more interest, as it’s calculation is more efficient and numerically more stable:
EMUpdate Rule
The methods derived up to this point are useful to calculate state probabilities, transition probabilities and output probabilities given a model and a sequence of inputs and outputs. As Baum and Welch did in the case of HMMs, these very probabilities will now be used to derive estimators for the model’s parameters. Let . Then
A similar scheme can be used to derive an estimator for the transition probabilities:
where can be approximated using the above estimator (this might mean that the new estimator is biased, not quite sure about that). The result is an estimator for . An estimator for the output probabilities can be derived accordingly, making use of the Markov property:
Thus,
This equation holds for every value of x. Therefore a better estimator can be derived by averaging the values over all inputs:
Implementation
To get a bit more concrete, I will add the Python code I wrote to execute the steps described above.
The definition of a POMDP. For simplicity, inputs and outputs are supposed to be natural numbers. The transition matrices corresponding to each of the input characters are stored in alist
(where alist[i]
is the transition matrix that corresponds to input symbol i
). As before, the matrix that maps state to observation probabilities is given by c
. The initial state distribution is stored in init
.
class MarkovModel(object): def __init__(self,alist,c,init): self.alist = alist self.c = c self.ns = c.shape[1] self.os = c.shape[0] self.inps = len(alist) self.init = init
In the above sections, the procedure was stated in a recursive form. It is, however, not advisable to actually implement the algorithm as a recursion as this will lead to a bad performance. A better approach is to use a dynamic programming approach: The and values are stored in a matrix where
 each column corresponds to one point in time
 each row corresponds to one state
This is where actually most of the magic happens.
This function will generate
 a tableau for
 a tableau for and
 a tableau for
To make the computation of and more efficient, I also calculate the common factor as derived above:
def make_tableaus(xs,ys,m): alpha = np.zeros((len(ys),m.ns)) beta = np.zeros((len(ys),m.ns)) gamma = np.zeros((len(ys),m.ns)) N = np.zeros((len(ys),1)) # Initialize: gamma[0:1,:] = m.init.T*m.c[ys[0]:ys[0]+1,:] N[0,0] = 1 / np.sum(gamma[0:1,:]) alpha[0:1,:] = N[0,0]*gamma[0:1,:] beta[len(ys)1:len(ys),:] = np.ones((1,m.ns)) for i in range(1,len(ys)): gamma[i:i+1,:] = m.c[ys[i]:ys[i]+1,:] * np.sum((m.alist[xs[i1]].T*alpha[i1:i,:].T),axis=0) N[i,0] = 1 / np.sum(gamma[i:i+1,:]) alpha[i:i+1,:] = N[i,0]*gamma[i:i+1,:] for i in range(len(ys)1,0,1): beta[i1:i] = N[i]*np.sum(m.alist[xs[i1]]*(m.c[ys[i]:ys[i]+1,:]*beta[i:i+1,:]).T,axis=0) return alpha,beta,N
These tableaus can be used to solve many inference problems in POMDPs. For instance, given a sequence of inputs and a sequence of observations that correspond to , estimate the probability of being in a certain state during each state. Put differently: The function state_estimates
will calculate the posterior distribution over all latent variables.
I included an optional tableaus
parameter. This ensures that, if we want to solve several inference problems, we only need to calculate the tableaus once.
def state_estimates(xs,ys,m,tableaus=None): if tableaus is None: tableaus = make_tableaus(xs,ys,m) alpha,beta,N = tableaus return alpha*beta
With the formulas that we derived above and using the tableaus, this becomes very simple. Note that the standard meaning of the *operator in numpy is not matrix multiplication but elementwise multiplication.
A bit more sophisticated is the following inference problem: Given a sequence of inputs and a sequence of observations that correspond to , estimate the probability of transferring between two states at each time step.
def transition_estimates(xs,ys,m,tableaus=None): if tableaus is None: tableaus = make_tableaus(xs,ys,m) alpha,beta,N = tableaus result = np.zeros((m.ns,m.ns,len(ys))) for t in range(len(ys)1): a = m.alist[xs[t]] result[:,:,t] = a*alpha[t:t+1,:]*m.c[ys[t+1]:ys[t+1]+1,:].T*beta[t+1:t+2,:].T*N[t+1,0] a=m.alist[xs[len(ys)1]] result[:,:,len(ys)1] = a*alpha[1:,:] return result
The next function again takes an input sequence and an output sequence and for each time step computes the posterior probability of being in a state and observing a certain output.
Note that as the sequence of outputs is already given, this function does not add much: If in time step an output of 1 was observed, obviously the probability of observing 2 in time step will be zero.
def stateoutput_estimates(xs,ys,m,sestimate=None): if sestimate is None: sestimate = state_estimates(xs,ys,m) result = np.zeros((m.os,m.ns,len(ys))) for t in range(len(ys)): result[ys[t]:ys[t]+1,:,t] = sestimate[t:t+1,:] return result
Having defined these functions, we can implement the BaumWelch style EM update procedure for POMDPs. It simply calculates
 The posterior state estimates for each
 The posterior transition estimates for each
 The posterior joint state/output estimates for each
This method can be repeated until the model converges (for some definition of convergence). The return value of this function is a new list of transition probabilities and a new matrix of output probabilities. These two define an updated POMDP model which should explain the data better than the old model.
def improve_params(xs,ys,m,tableaus=None): if tableaus is None: tableaus = make_tableaus(xs,ys,m) estimates = state_estimates(xs,ys,m,tableaus=tableaus) trans_estimates = transition_estimates(xs,ys,m,tableaus=tableaus) sout_estimates = stateoutput_estimates(xs,ys,m,sestimate=estimates) # Calculate the numbers of each input in the input sequence. nlist = [0]*m.inps for x in xs: nlist[x] += 1 sstates = [np.zeros((m.ns,1)) for i in range(m.inps)] for t in xrange(len(ys)): sstates[xs[t]] += estimates[t:t+1,:].T/nlist[xs[t]] # Estimator for transition probabilities alist = [np.zeros_like(a) for a in m.alist] for t in xrange(len(ys)): alist[xs[t]] += trans_estimates[:,:,t]/nlist[xs[t]] for i in xrange(m.inps): alist[i] = alist[i]/(sstates[i].T) np.putmask(alist[i],(np.tile(sstates[i].T==0,(m.ns,1))),m.alist[i]) c = np.zeros_like(m.c) for t in xrange(len(ys)): x = xs[t] c += sout_estimates[:,:,t]/(nlist[x]*m.inps*sstates[x].T) # Set the output probabilities to the original model if # we have no state observation at all. sstatem = np.hstack(sstates).T mask = np.any(sstatem == 0,axis=0) np.putmask(c,(np.tile(mask,(m.os,1))),m.c) return alist,c
How to test that? Well, we’ve seen how to calculate the loglikelihood of the data under a model:
def likelihood(tableaus): alpha,beta,N = tableaus return np.product(1/N) def log_likelihood(tableaus): alpha,beta,N = tableaus return np.sum(np.log(N))
Summary and Caveats
This post showed how to learn a POMDP model with python. The technique seems to be reasonably numerically stable (while I experienced major problems with a version based on the original alphabeta method). Still there are some degeneracies that can occur, e.g. if a certain input was never observed the result of the division by nlist[xs[t]]
may not be defined. That is why I used that strange mask
construction to work around the problem. I do not claim that the implementation that I used is extraordinarily fast or optimised and I’d be glad about suggestions how to improve it.
I also experimented with a version of the function that creates a weighted average of the old and the new transition probabilities. This technique seemed to be more appropriate for some applications.
From a theoretical point of view the derivation I presented is no black magic at all: It turns out that learning a POMDP model is almost the same as learning a HMM, except that we have one transition matrix for each input to the system. Yet, it is still nice to see that it does work!
Finally, there is the unfortunate caveat of every EMbased technique: Even though the algorithm is guaranteed to converge, there is no guarantee that it finds the global optimum.
References
[1]  LE Baum, T Petrie, and G Soules. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 1970. [ http ] 
[2]  Pierre a Devijver. Baum’s forwardbackward algorithm revisited. Pattern Recognition Letters, 3(6):369373, December 1985. [ DOI  http ]

[3]  Andrew McCallum. Overcoming incomplete perception with utile distinction memory. In Proceedings of the Tenth International Conference on Machine Learning, pages 190196. Citeseer, 1993. [ http ]

[4]  Andrew McCallum. Reinforcement learning with selective perception and hidden state. Phd thesis, University of Rochester, 1996. [ .ps.gz ] 
[5]  Daniel Mescheder, Karl Tuyls, and Michael Kaisers. Opponent Modeling with POMDPs. In Proc. of 23nd BelgiumNetherlands Conf. on Artificial Intelligence (BNAIC 2011), pages 152159. KAHO SintLieven, Gent, 2011. [ .pdf ]

[6]  Lloyd R Welch. BaumWelch Algorithm. IEEE Information Theory Society Newsletter, 53(4), 2003. 
This file was generated by bibtex2html 1.95.
Hello Yampa
Having tried (more or less successfully) to find good introductory literature on the topic of Functional Reactive Programming (FRP) I decided to keep track of my own experiments and convert the resulting log to small tutorials. Take this as a disclaimer: I myself am trying to figure out how this all works. Any ideas how to do better are thus more than welcome.
Functional Reactive Programming is a technique to work with systems that change over time using the toolbox of functional programming. The idea is to treat the whole system as a “stream” which stems from some input, in between sits the programmer who is to decide how to use the input stream and create from it an output stream. Consider the following examples:
A GUI application
It is easily possible to interpret GUI applications as an instance of the reactive programming perspective:
 The input signal shall be the user inputs. Consider the case in which the only means of input is a mouse with just one button, then the signal is a tuple of the current coordinates and and a boolean indication whether or not the mouse button is currently pressed.
 The output signal is what is drawn on the screen. This might for example be a simple pixel array with RGB values for each point.
 The task of the programmer is to couple the input signal to the output signal. As an example, we probably want to move the graphical cursor on the screen according to the current position. This means that the corresponding pixels in the pixel array are changed accordingly.
A Robot Controller
A different use case is a controller of a robotic system. Imagine for example an arm with a number of joints to control. The fact that this is a system that exhibits some “behaviour over time” indicates that again FRP is applicable. And indeed:
 The input signal will be the information we get from the sensors, measuring e.g. joint positions.
 The output signal is whatever is sent to the robot actuators e.g. motor torques.
 The task of the programmer is to map from the sensor data to the actuator signal.
FRP Frameworks
There are several Haskell frameworks that support FRP. Unfortunately I find it difficult to estimate their respective level of maturity and overall quality. I will be focusing on Yampa which can be installed quite easily using cabal:
$ cabal install Yampa
The Haskell Wiki list some more FRP frameworks but I found the documentation rather sparse.
The “Hello World” Example
Let’s start with a simple “Hello World” application. The idea is to achieve the following: The input to the system is static signal with value “Hello Yampa”. This signal will be piped through the main program without being changed. Thus, the resulting signal should be just “Hello Yampa”. All the output function has to do is to print it via putStrLn and indicate that it wants to terminate the program.
import FRP.Yampa main :: IO () main = reactimate initialize input output process initialize :: IO String initialize = return "Hello Yampa" input :: Bool > IO (DTime, Maybe String) input _ = return (0.0, Nothing) output :: Bool > String > IO Bool output _ x = putStrLn x >> return True process :: SF String String process = identity
Let’s see bit by bit what happens here.
The type of the reactimate
function, which serves as the core of the application is as follows:
reactimate :: IO a > (Bool > IO (DTime, Maybe a)) > (Bool > b > IO Bool) > SF a b > IO ()
Basically, the type argument a
is the type of the input signal. In our case this will thus simply be a string. The first argument is the initial input data. As it is in the IO
Monad, we could retrieve this data from the actual physical system (e.g. the position of the mouse cursor).
The second argument of reactimate
is the function that supplies us with sensor data. I do not understand why it takes a Bool
parameter and according to this page, that parameter is not even used. Again, it is wrapped in the IO
Monad and we could use it to get the new position of the mouse cursor or the current sensor data. Let’s look at the type of the thing that is wrapped in IO
: It is (DTime, Maybe a)
. The first part of the tuple is the amount of time that has passed since the last time the input
function has been called. It is thus (in theory) our task to do the bookkeeping and to return the right value here. However, the way the Hello Yampa program works, input
will actually never be called, thus the exact return value does not matter. The second element can be Nothing
which translates to “there was no input”.
The return value of the output
function signals whether or not we want to terminate: True
stops the program. As in this simple example the return value is always True
the program will terminate immediately after having executed output
once.
More information about reactimate
can be found here.
Timed “Hello World”
Obviously, printing “Hello Yampa” on a screen would not be the example were FRP would typically be applied. Instead, we want to deal with systems that change over time. To include this time element, let us consider a slightly extended “Hello World”example in which
 the input consists of a constant stream of “Hello Yampa” and the current time
 the function is to filter all instances where the time is more than two seconds
 the output is to print the result to console as before
import FRP.Yampa import Data.IORef import Data.Time.Clock type Input = (String, DTime) type Output = Maybe String data Stopwatch = Stopwatch { zero :: UTCTime prev :: UTCTime } startStopwatch :: UTCTime > Stopwatch startStopwatch now = Stopwatch now now storeStopwatch :: IO (IORef Stopwatch) storeStopwatch = getCurrentTime >>= (newIORef . startStopwatch) diffTime :: (IORef Stopwatch) > IO (DTime,DTime) diffTime ref = do now < getCurrentTime (Stopwatch zero prev) < readIORef ref writeIORef ref (Stopwatch zero now) let dt = realToFrac (diffUTCTime now prev) timePassed = realToFrac (diffUTCTime now zero) return (dt, timePassed) main :: IO () main = do handle < storeStopwatch reactimate initialize (input handle) output process initialize :: IO Input initialize = return ("Hello Yampa",0) input :: (IORef Stopwatch) > Bool > IO (DTime, Maybe Input) input ref _ = do (dt,timePassed) < diffTime ref return (dt, Just ("Hello Yampa",timePassed)) output :: Bool > Output > IO Bool output _ (Just x) = putStrLn x >> return False output _ Nothing = return True process :: SF Input Output process = arr afun where afun (message, timePassed)  timePassed <= 1 = Just message  otherwise = Nothing
The first thing to notice here is the code that keeps track of the time that has passed. The start time and the time of the last call of input
are stored in an IORef
reference. I wonder why this kind of very generic looking code is not part of the reactimate loop.
The input
function now returns a tuple containing the string and the time that has passed. Furthermore it updates the time of the stopwatch. The process
function looks at the time that has passed and decides to either return Nothing
or to just pipe through the message.
Using the arrow syntax for which we need
{# LANGUAGE Arrows #}
this can be written this in a different way:
process :: SF Input Output process = proc (message, timePassed) > returnA < if (timePassed <= 1) then Just message else Nothing
I suppose that the arrow syntax does not really make sense in this simple case. Whenever the process
function returns Nothing
the program will stop.
In theory, the signal function is assumed to be continuous. Therefore we should get infinitely many instances of “Hello Yampa” printed to the screen. Obviously that’s not possible. And indeed, internally Yampa seems to discretize the signal function.
Conclusion
There are some things that puzzle me about Yampa. For instance:
 There are those seemingly unused boolean parameters in the input and output functions.
 I don’t quite understand why the whole timebookkeeping work is not done internally.
Yet I think that these two examples provide some intuition about how a Yampa program is structured. Also it becomes apparent what the beauty of this approach is to a Haskell programmer: All unpure operations can be located in the input
and output
function. The process
that translates input to output has no side effects at all. Now I have to look at some more sophisticated examples involving state.
Updates
 kickstarter for the Reactive Banana library. : Alfredo simultaniously worked on a
Nothing
in theinput
function actually corresponds to “there was no input”. I updated the text accordingly.
: Gerold noted below that a value of