This is a nice little paper on discovering similar papers without having to trawl through the literature. The most common technique is to look at the papers citing any given paper, but this tends to favour highly cited papers and miss those that slip through the net. Occasionally, papers languish in the doldrums for years before people suddenly notice them. And if someone is new to a particular field, they might not necessarily cite the expected literature. Often a citation just references one very specific aspect of a paper while the citing paper actually focuses on very different issues, so there's really no point in reading it if you wanted something similar. Basically, the citation network is limited.
The method here uses some basic searches for keywords and their importance, which is measured (I think) by the fraction of the word count. It then trawls the literature (they downloaded the whole of arXiv for this!) for papers with similar keywords of similar importance. They say that this method, which is open and public, gives better results than the similar (closed) method now available on NASA ADS. It sounds like a useful way to discover new and related papers, especially ones that might not be well cited, when starting work in a slightly different sub-topic.
http://adsabs.harvard.edu/abs/2017arXiv170505840K
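To make the keyword idea above a bit more concrete, here is a minimal sketch in Python. It is not the paper's actual code: it assumes, as I guessed above, that a keyword's importance is just its fraction of the word count, and it swaps in plain cosine similarity to compare two papers. The stop-word list and the function names (`keyword_weights`, `similarity`, `most_similar`) are made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "in", "to", "is", "for", "we", "with", "on"}


def keyword_weights(text):
    """Map each non-stop word to its fraction of the non-stop word count.

    Assumption: "importance" is simply relative frequency.
    """
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    total = len(words)
    if total == 0:
        return {}
    return {w: n / total for w, n in Counter(words).items()}


def similarity(a, b):
    """Cosine similarity between two keyword-weight dictionaries."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, corpus, top_n=3):
    """Rank corpus entries (title -> text) by similarity to the query text."""
    q = keyword_weights(query)
    scored = [(similarity(q, keyword_weights(text)), title)
              for title, text in corpus.items()]
    return sorted(scored, reverse=True)[:top_n]
```

Calling `most_similar(abstract, corpus)` with a dictionary of titles and abstracts returns the best-matching titles with their scores, and citation counts never enter into it, which is the whole point.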
Sister blog of Physicists of the Caribbean in which I babble about non-astronomy stuff, because everyone needs a hobby
I apply Solr to exactly this sort of problem.
lucene.apache.org - Apache Solr