This is a nice little paper on discovering similar papers without having to trawl through the literature. The most common technique is to look at the papers citing any given paper, but this favours highly cited papers and misses those that slip through the net. Occasionally, papers languish in the doldrums for years before people suddenly notice them. And if someone is new to a particular field, they might not necessarily cite the expected literature. Often a citation just references one very specific aspect of a paper, while the citing paper itself focuses on very different issues, so there's really no point in reading it if you wanted something similar. Basically, the citation network is limited.
The method here uses some basic searches for keywords and their importance, which is measured (I think) by each keyword's fraction of the word count. It then trawls the literature (they downloaded the whole of arXiv for this!) for papers with similar keywords of similar importance. They say that this method, which is open and public, gives better results than the similar (closed) method now available on NASA ADS. It sounds like a useful way to discover new and related papers, especially ones that might not be well cited, when starting work in a slightly different sub-topic.
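To make the idea a bit more concrete, here's a minimal Python sketch of how I imagine this sort of thing working. To be clear, this is my own guess at the general approach and not the paper's actual code: the stop-word list, the function names (`keyword_weights`, `similarity`, `rank_similar`) and the choice of cosine similarity for comparing papers are all my assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be far longer.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "we",
              "for", "with", "on", "that", "this"}


def keyword_weights(text):
    """Map each remaining word to its fraction of the filtered word count."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOP_WORDS]
    total = len(words)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(words).items()}


def similarity(a, b):
    """Cosine similarity between two keyword-weight dictionaries (assumed metric)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query, corpus):
    """Return (title, score) pairs, most similar first.

    `corpus` maps a paper title to its text (e.g. an abstract).
    """
    q = keyword_weights(query)
    scores = [(title, similarity(q, keyword_weights(text)))
              for title, text in corpus.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

You'd call it with something like `rank_similar(my_abstract, {"Paper A": abstract_a, "Paper B": abstract_b})`. Note that nothing here looks at citations at all, which is the whole point: a paper nobody has cited can still come out on top if it talks about the same things with the same emphasis.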
Sister blog of Physicists of the Caribbean in which I babble about non-astronomy stuff, because everyone needs a hobby
I apply Solr to exactly this sort of problem.
lucene.apache.org - Apache Solr