Two fields of the same field type are considered similar if the sets of words occurring in these fields are phonetically similar and have a large nonempty intersection. Two entries are considered similar if the majority of the checked fields are similar (title, author, booktitle, journal, publisher; title and author are weighted doubly) or equal (year, number, volume, pages, edition).
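The weighted majority rule can be sketched as follows. This is only an illustration under stated assumptions, not BibConsist's actual code: the names `entries_similar` and `fields_similar`, the overlap threshold of 0.6, and the decision to count only fields present in both entries are all hypothetical. For brevity the sketch compares lowercased words directly rather than their phonetic codes.

```python
import re

# Fields compared for similarity (title and author count double)
# and fields compared for equality, as listed in the text.
SIMILAR_FIELDS = {"title": 2, "author": 2, "booktitle": 1, "journal": 1, "publisher": 1}
EQUAL_FIELDS = {"year": 1, "number": 1, "volume": 1, "pages": 1, "edition": 1}


def words(value):
    """Return the set of lowercased alphanumeric words in a field value."""
    return set(re.findall(r"[a-z0-9]+", str(value).lower()))


def fields_similar(a, b, min_overlap=0.6):
    """Assumed criterion: shared words relative to the smaller word set."""
    wa, wb = words(a), words(b)
    if not wa or not wb:
        return False
    return len(wa & wb) / min(len(wa), len(wb)) >= min_overlap


def entries_similar(e1, e2):
    """Weighted majority vote over the fields that both entries define."""
    agree = total = 0
    for field, weight in SIMILAR_FIELDS.items():
        if field in e1 and field in e2:
            total += weight
            if fields_similar(e1[field], e2[field]):
                agree += weight
    for field, weight in EQUAL_FIELDS.items():
        if field in e1 and field in e2:
            total += weight
            if str(e1[field]).strip() == str(e2[field]).strip():
                agree += weight
    # "Majority" is read here as more than half of the compared weight.
    return total > 0 and 2 * agree > total
```

Each entry is passed as a plain dictionary mapping field names to values. How BibConsist treats fields that are missing from one of the two entries is not stated in the text, so skipping them is a guess.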
Two words are phonetically similar, i.e. they sound the same in English, if their soundex codes are equal. The original soundex code [Knu73] is an indexing system that translates names into a four-character code consisting of one letter and three digits.
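For reference, the classic soundex code can be sketched in a few lines. This shows the standard four-character scheme only, not the modified variant used in BibConsist; the function name `soundex` and the choice to ignore non-ASCII characters are assumptions made for this illustration.

```python
# Digit classes of the classic soundex code.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the classic soundex code: one letter followed by three digits."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")  # the first letter's code also suppresses repeats
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w are dropped and do not separate equal codes
        code = CODES.get(c, "")  # vowels and y have no code
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so equal codes around it both count
    # Pad with zeros or truncate to exactly three digits.
    return (first.upper() + "".join(digits) + "000")[:4]
```

With this sketch, `soundex("Robert")` and `soundex("Rupert")` both yield `R163`, so the two names would count as phonetically similar.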
To compare the phonetic representations of arbitrarily long character strings, which may contain digits, BibConsist modifies the soundex code in two respects:
Modified soundex code:
String similarity:
To determine the similarity of strings, two methods are used:
Besides similarities, BibConsist checks that no citekey is multiply defined, that all citekeys in the fields precedes, succeeds and cites are defined, and that no key in precedes, succeeds or cites points to the entry itself. Moreover, BibConsist tests whether the title of a book is defined in the field booktitle and in the field title.
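The three citekey checks can be sketched as follows. This is a minimal illustration, not BibConsist's implementation: the function `check_citekeys`, the `(kind, citekey, detail)` result format, and the assumption that precedes, succeeds and cites hold comma-separated lists of citekeys are all hypothetical.

```python
from collections import Counter

REFERENCE_FIELDS = ("precedes", "succeeds", "cites")


def split_keys(value):
    """Split a reference field into citekeys (assumed comma-separated)."""
    return [k.strip() for k in value.split(",") if k.strip()]


def check_citekeys(entries):
    """entries: list of (citekey, fields) pairs, fields being a dict.

    Returns a list of (kind, citekey, detail) tuples.
    """
    problems = []
    counts = Counter(key for key, _ in entries)
    # Check 1: a citekey must not be defined more than once.
    for key, n in counts.items():
        if n > 1:
            problems.append(("multiply-defined", key, key))
    defined = set(counts)
    for key, fields in entries:
        for field in REFERENCE_FIELDS:
            for target in split_keys(fields.get(field, "")):
                if target == key:
                    # Check 3: an entry must not reference itself.
                    problems.append(("self-reference", key, field))
                elif target not in defined:
                    # Check 2: every referenced citekey must be defined.
                    problems.append(("undefined", key, target))
    return problems
```

Entries are passed as a list rather than a dictionary so that repeated citekeys are not lost before the duplicate check runs.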
Examples:
We used BibConsist to check geombib (version of March 1997) against itself. We found only 69 pairs of inconsistent similar entries (not counting technical reports, theses, etc.) and 49 citekey errors. The following examples of the types of inconsistencies found with BibConsist are extracted from this check.
@inproceedings{aarx-clgta-96 , author = "Oswin Aichholzer and Franz Aurenhammer and G{\"u}nter Rote and Yin-Feng Xu" , title = "Constant-level greedy triangulations approximate the MWT well" , editor = "Ding-Zhu Du and Xiang-Sun Zhang and Kan Cheng" , booktitle = "Proc. Second Internat. Symp. Operations Research and its Applications, Guilin, China, December 11--13, 1996" , series = "Lecture Notes in Operations Research" , volume = 2 , publisher = "World Publishing Corp." , address = "Beijing" , year = 1996 , pages = "309--318" , precedes = "aarx-clgta-96" , update = "97.03 rote" }
@book{p-stces-93 , title = "Set Theoretic Constructions in {Euclidean} Spaces" , editor = "J. Pach" , booktitle = "New Trends in Discrete and Computational Geometry" , series = "Algorithms and Combinatorics" , volume = 10 , publisher = "Springer-Verlag" , year = 1993 , keywords = "discrete/computational geometry, book, survey papers" , comments = "contains gs-caa-93, s-barga-93, m-encg-93, k-cpvc-93, gp-asotd-93, km-hart-93, gpw-gtt-93, b-hlcpr-93, b-gcabt-93, fk-rrtpc-93, mp-rdcg-93, and k-stces-93" , update = "93.09 erickson" }
@book{s-asds-90 , author = "H. Samet" , title = "Applications of Spatial Data Structures" , publisher = "Addison-Wesley" , address = "Reading, MA" , year = 1990 , update = "97.03 schwarzkopf" } looks similar to @book{s-asdsc-90 , author = "H. Samet" , title = "Applications of Spatial Data Structures: Computer Graphics, Image Processing, and {GIS}" , publisher = "Addison-Wesley" , address = "Reading, MA" , year = 1990 , isbn = "0-201-50300-X" , keywords = "octrees" , update = "97.03 schwarzkopf, 93.09 held" }
@incollection{fs-amgfe-72 , author = "J. Fukuda and J. Suhara" , title = "Automatic Mesh Generation for Finite Element Analysis" , editor = "J. T. Oden and R. W. Clough and Y. Yamamoto" , booktitle = "Advances in Computational Methods in Structural Mechanics and Design" , publisher = "UAU Press" , address = "Hunstville, Alabama" , year = 1972 , annote = "Two phases. First randomly generates points in polygon to required density, then triangulates points by horribly complicated algorithm. Picks five points minimizing triangle edge length. Discards triangle intersecting or containing. Then picks point making this triangle and the next one as equilateral as possible." } looks similar to @incollection{sf-amgfe-72 , author = "J. Suhara and J. Fukuda" , title = "Automatic Mesh Generation for Finite Element Analysis" , editor = "J. T. Oden and R. W. Clough and Y. Yamamoto" , booktitle = "Advances in Computational Methods in Structural Mechanics and Design" , publisher = "UAU Press" , address = "Huntsville, AL" , year = 1972 , pages = "607--624" , annote = "Adds points to interior and then triangulates." }
@article{ngv-begs- , author = "M. H. Nodine and M. T. Goodrich and J. S. Vitter" , title = "Blocking for External Graph Searching" , journal = "Algorithmica" , note = "To appear" , update = "97.03 tamassia" } looks similar to @article{ngv-begs-96 , author = "M. H. Nodine and M. T. Goodrich and J. S. Vitter" , title = "Blocking for External Graph Searching" , journal = "Algorithmica" , volume = 16 , number = 2 , month = aug , year = 1996 , pages = "181--214" , update = "97.03 murali" }
@inproceedings{dl-cvdrp-91 , author = "H. Djidjev and A. Lingas" , title = "On computing the {Voronoi} diagram for restricted planar figures" , booktitle = "Proc. 2nd Workshop Algorithms Data Struct." , series = "Lecture Notes Comput. Sci." , volume = 519 , publisher = "Springer-Verlag" , year = 1991 , pages = "54--64" , keywords = "Voronoi diagram, Delaunay triangulation, simple polygon, histogram" , precedes = "dl-cvdsp-95" , update = "96.09 devillers" } looks similar to @incollection{d-cvdrp-91 , author = "H. Djidjev" , title = "On computing the {Voronoi} diagram of restricted planar figures" , booktitle = "??" , series = "Lecture Notes Comput. Sci." , volume = 519 , year = 1991 , pages = "54--64" , keywords = "Voronoi diagram, lower bounds" , update = "95.09 korneenko" }
@book{o-cgc-94b , author = "J. O'Rourke" , title = "Computational Geometry in {C}" , publisher = "Cambridge University Press" , year = 1994 , update = "97.03 tamassia" } looks similar to @book{o-cgcfix-94 , author = "J. O'Rourke" , title = "Computational Geometry in {C}" , publisher = "Cambridge University Press" , year = 1994 , update = "97.03 tamassia" }
@book{t-dsna-83 , author = "R. E. Tarjan" , title = "Data Structures and Network Algorithms" , series = "CBMS-NSF Regional Conference Series in Applied Mathematics" , volume = 44 , publisher = "Society for Industrial Applied Mathematics" , year = 1983 , keywords = "graph drawing" , update = "93.09 tamassia" } looks similar to @book{t-dsna-87 , author = "R. E. Tarjan" , title = "Data Structures and Network Algorithms" , publisher = "Society for Industrial and Applied Mathematics" , address = "Philadelphia, PA" , year = 1987 }
@article{fpp-hdpgg-90 , author = "H. de Fraysseix and J. Pach and R. Pollack" , title = "How to Draw a Planar Graph on a Grid" , journal = "Combinatorica" , volume = 10 , year = 1990 , pages = "41--51" , keywords = "graph drawing" , update = "93.09 tamassia" } looks similar to @article{dpp-hdpgg-90 , author = "H. {De Fraysseix} and J. Pach and R. Pollack" , title = "How to draw a planar graph on a grid" , journal = "Combinatorica" , volume = 10 , number = 1 , year = 1990 , pages = "41--51" , keywords = "graph representation" , update = "95.09 korneenko" }
@article{dv-cprac-77 , author = "A. K. Dewdney and J. K. Vranch" , title = "A convex partition of {$R^{3}$} with applications to {Crum}'s problem and {Knuth}'s post-office problem" , journal = "Utilitas Math." , volume = 12 , year = 1977 , pages = "193--199" } looks similar to @article{dv-cpr3a-77 , author = "A. K. Dewdney and J. K. Vranch" , title = "Convex partition of $R^3$ with applicatton to {Crum's} problem and {Knuth's} post-office problem" , journal = "Utilitas Math." , volume = 12 , year = 1977 , pages = "193--199" , keywords = "Voronoi diagram, proximity, searching" , update = "95.09 korneenko" }
This program was originally developed for use with the computational geometry bibliographic database, but BibConsist can check any BibTeX file. The BibTeX database should contain the fields title, author, booktitle, journal, publisher, year, number, volume, pages and edition, because BibConsist uses these fields to check the similarity of two entries. BibConsist ignores all fields that are unknown in geombib.
BibConsist is in the public domain and may be obtained by anonymous ftp from ftp.fernuni-hagen.de in the file pub/fachb/inf/pri6/BibRelEx/BibConsist/BibConsist.tar. You may use it or modify it to your heart's content, at your own risk. Bouquets, brickbats, and bug fixes may be sent to Britta Landgraf.
© Universität Bonn, Informatik Abt. I - Last modified: Mon Oct 15 19:16:00 2001