Edit Distance
From SONIVIS:Wiki
Edit distance is the amount of change between two revisions of a page, such as words being deleted, moved or added. Because revisions belong to a page and the calculation takes time into account, this metric can be measured for a page in a dynamic model.
Objective
The edit distance describes the difference between two states (revisions) of a page. This measure indicates the size of a change and also the activity of an author in terms of her/his content contribution with this revision. Summing all edit distances made by a single author yields the measure Edit Only.
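The summation that leads to Edit Only can be sketched in a few lines of Python. This is only an illustration: the revision log, the author names and the function name `edit_only` are hypothetical, and the distances are assumed to have been computed already.

```python
from collections import defaultdict

# Hypothetical revision log: (author, edit distance of that revision in words).
# Names and numbers are made up for illustration only.
revisions = [
    ("alice", 1),
    ("bob", 3),
    ("alice", 5),
]


def edit_only(revisions):
    """Sum the edit distances of all revisions made by each author."""
    totals = defaultdict(int)
    for author, distance in revisions:
        totals[author] += distance
    return dict(totals)


print(edit_only(revisions))  # {'alice': 6, 'bob': 3}
```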
Explanation
Text within a page can be changed. The edit distance quantifies this change between two versions of a text: we count the words added, deleted, replaced, displaced etc. between the two versions. It is a characteristic of a page in a dynamic model and shows how the page changes.
According to B. T. Adler, L. de Alfaro, I. Pye and V. Raman[1], these edit distances can be used to find the “direction” of a change. This is used to calculate the Edit size, Edit longevity and Edit Only measures.
Calculation
B. T. Adler, L. de Alfaro, I. Pye and V. Raman[1] define the edit distance in terms of the following quantities
(a detailed treatment is given in "A content-driven reputation system for the Wikipedia"[1]):
- d(v_i, v_j) is the edit distance between versions v_i and v_j; it measures how much change (word additions, deletions, replacements, displacements etc.) there has been in going from v_i to v_j. The edit contribution made in a revision r_i is defined as d(r_i) = d(v_{i-1}, v_i).
- I(v_i, v_j) is the number of words that are inserted
- D(v_i, v_j) is the number of words that are deleted
- M(v_i, v_j) is the number of words that are moved
The edit distance can be computed using the block analysis.
| Version | Text | Edit distance |
| 9 | This is a text. | |
| 10 | This is a lovely text. | d(9,10) = 1 word |
| 11 | This is a very nice and lovely text. | d(10,11) = 3 words |
d(9,11) = 1 word + 3 words = 4 words
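The numbers in this example can be reproduced with a small Python sketch. It is a simplification, not the block analysis of Adler et al.: it assumes the distance is simply the number of inserted plus deleted words as reported by the standard library's `difflib`, and it does not detect moved words (a move would be counted as a deletion plus an insertion). The function names are hypothetical.

```python
from difflib import SequenceMatcher


def word_edit_counts(old, new):
    """Return (inserted, deleted) word counts between two texts."""
    a, b = old.split(), new.split()
    inserted = deleted = 0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted += i2 - i1   # words present only in the old version
        if tag in ("insert", "replace"):
            inserted += j2 - j1  # words present only in the new version
    return inserted, deleted


def edit_distance(old, new):
    # Assumption: distance = inserted + deleted words; moves are not modelled.
    inserted, deleted = word_edit_counts(old, new)
    return inserted + deleted


v9 = "This is a text."
v10 = "This is a lovely text."
v11 = "This is a very nice and lovely text."

print(edit_distance(v9, v10))   # 1
print(edit_distance(v10, v11))  # 3
print(edit_distance(v9, v11))   # 4
```

In this example the direct comparison d(9,11) equals the sum of the two steps because every change is an insertion; in general the direct distance can be smaller than the sum, for instance when a later revision reverts an earlier one.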
Note: A different calculation is also possible: sum the changed characters instead of the changed words to define an edit distance.
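As a rough illustration of a character-based variant, the sketch below uses the classic Levenshtein distance (the minimum number of single-character insertions, deletions and substitutions). This is one possible reading of "changed characters"; the page does not say which character-level measure is meant, and the function name is hypothetical.

```python
def char_edit_distance(old, new):
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of `old`
    # and the first j characters of `new`.
    prev = list(range(len(new) + 1))
    for i, ca in enumerate(old, start=1):
        curr = [i]
        for j, cb in enumerate(new, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


print(char_edit_distance("This is a text.", "This is a lovely text."))  # 7
```

Here the seven characters of "lovely " (including the space) are inserted, so the character-based distance is 7 where the word-based distance is 1.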
References
@inproceedings{Adler2008,
title = {Measuring Author Contribution to the Wikipedia},
address = {Porto, Portugal},
author = {B. Thomas Adler and Luca de Alfaro and Ian Pye and Vishwanath Raman},
booktitle = {WikiSym ’08},
month = {September},
url = {http://www.wikisym.org/ws2008/proceedings/research%20papers/18500027.pdf},
year = {2008},
keywords = {Wikipedia contribution edit_longevity quality wiki}
}
@inproceedings{citeulike:1291537,
address = {New York, NY, USA},
author = {B. Thomas Adler and Luca de Alfaro},
booktitle = {WWW '07: Proceedings of the 16th international conference on World Wide Web},
doi = {http://dx.doi.org/10.1145/1242572.1242608},
keywords = {content-driven, reputation, wikipedia},
pages = {261--270},
publisher = {ACM Press},
title = {A content-driven reputation system for the wikipedia},
year = {2007}
}

