Algorithm Implementation/Strings/Dice's coefficient

WARNING
Before using any implementation below, check that it meets your needs.

Some implementations (for example, the new, unrevised JavaScript implementation) count each distinct bigram only once, which is inadequate for measuring string similarity: the similarity of "ggggg" and "gg" comes out as 1, for instance, even though the strings differ. See the comment inside the Java implementation, which appears to handle this correctly. A review by a specialist on this topic would be welcome, both to clarify the problem and to state this warning properly. See the Discussion page.
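
To make the pitfall concrete, here is a minimal Python sketch of the set-based variant that the warning describes. The names `bigrams` and `dice_set` are hypothetical and are not taken from any implementation on this page. The handling of strings too short to contain a bigram is also an assumption.

```python
def bigrams(s):
    """Return the list of adjacent character pairs in s."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice_set(x, y):
    """Dice's coefficient over *sets* of bigrams (duplicates are collapsed)."""
    a, b = set(bigrams(x)), set(bigrams(y))
    if not a and not b:
        # Assumption: with no bigrams at all, fall back to plain equality.
        return 1.0 if x == y else 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Both strings collapse to the single bigram "gg", so they look identical.
print(dice_set("ggggg", "gg"))  # 1.0
```

Because each repeated bigram is counted once, the two strings are reported as a perfect match, which is the behaviour the warning cautions against.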

Algorithm
Dice's coefficient measures how similar two sets are. It can be used to measure how similar two strings are in terms of the number of bigrams they have in common (a bigram is a pair of adjacent letters in a string). Below we give implementations of Dice's coefficient for two strings in different programming languages.

According to the Wikipedia page on the Dice coefficient, when it is used as a string similarity measure, the coefficient may be calculated for two strings x and y using bigrams as follows:

$$d = \frac{2 n_t}{n_x + n_y}$$

where $n_t$ is the number of character bigrams found in both strings, $n_x$ is the number of bigrams in string x and $n_y$ is the number of bigrams in string y.
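
As a worked example (the strings are chosen here purely for illustration), take x = "night" and y = "nacht". The bigrams of x are ni, ig, gh, ht and those of y are na, ac, ch, ht. Each string has 4 bigrams and only ht occurs in both, so the number of shared bigrams is 1:

$$d = \frac{2 \cdot 1}{4 + 4} = 0.25$$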

An alternative, multiset-oriented description of d is:

$$d = \frac{2\,|AB|}{|A| + |B|}$$

where A, B and AB are to be understood as multisets. A and B contain the bigrams of x and y respectively, and AB is their multiset intersection: if an item has multiplicity a in A and b in B, then it has multiplicity min(a, b) in AB. For a multiset X, the notation |X| denotes the number of items in X, counted with multiplicity.
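
The multiset description can be sketched in Python with `collections.Counter`, whose `&` operator keeps the minimum count of each item. This is an illustrative sketch rather than a reference implementation: the name `dice_multiset` is hypothetical, and the treatment of strings with no bigrams is an assumption.

```python
from collections import Counter


def dice_multiset(x, y):
    """Dice's coefficient over multisets of character bigrams."""
    a = Counter(x[i:i + 2] for i in range(len(x) - 1))
    b = Counter(y[i:i + 2] for i in range(len(y) - 1))
    total = sum(a.values()) + sum(b.values())  # |A| + |B|
    if total == 0:
        # Assumption: with no bigrams at all, fall back to plain equality.
        return 1.0 if x == y else 0.0
    overlap = sum((a & b).values())  # |AB|, using min(a, b) per bigram
    return 2 * overlap / total


print(dice_multiset("ggggg", "gg"))     # 0.4
print(dice_multiset("night", "nacht"))  # 0.25
```

For "ggggg" and "gg" the multisets hold four and one copies of gg, so |AB| = 1 and the result is 2 × 1 / (4 + 1) = 0.4 rather than 1.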

Haskell
A naive, non-optimized version.

C
A possible problem with the above example is that it considers the position of the bigrams as well as their presence/absence. An alternative implementation which is independent of bigram position is given below: