MIT Robot Detects Fake Scientific Papers

Started by TehBorken, Apr 25 06 08:24


TehBorken

Ever wondered whether a scientific paper was actually written by a robot? Of course you have. Science to the rescue: a [a href="http://www.newscientist.com/blog/technology/2006/04/fake-paper-detector.html"]new program[/a] developed by researchers at Indiana University promises to tell you one way or the other. It was developed in response to [a href="http://www.newscientisttech.com/channel/tech/mg18624963.700.html"]a prank[/a] by MIT researchers who generated a paper from random bits of text and got it accepted for a conference.
[hr style="width: 100%; height: 2px;"][a href="http://www.newscientist.com/blog/technology/uploaded_images/FAKEDE%7E1-780877.gif"][img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://www.newscientist.com/blog/technology/uploaded_images/FAKEDE%7E1-777868.gif" alt="" border="0"][/a]You may remember the story of some cheeky MIT students who wrote a computer programme to [a href="http://www.newscientisttech.com/channel/tech/mg18624963.700.html"]generate scientific papers[/a]. Well, now some researchers at the [a href="http://www.informatics.indiana.edu/"]Indiana University School of Informatics[/a] have come up with an [a href="http://montana.informatics.indiana.edu/fsi/about.html"]Inauthentic Paper Detector[/a] to foil it.

Mehmet Dalkilic, a data mining expert, explains how it works: "We believe that there are subtle, short- and long-range word or even word string repetitions that exist in human texts, but not in many classes of computer-generated texts that can be used to discriminate based on meaning."
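
To make the idea of short- and long-range repetition concrete, here is a toy Python sketch. It is not the Indiana detector's actual method, which the quote does not spell out. The function name `repetition_profile`, the choice of word pairs (`n=2`), and the 50-word cutoff between "short" and "long" range are all assumptions made purely for illustration.

```python
def repetition_profile(text, n=2, short_range=50):
    """Toy illustration (hypothetical, not the real detector).

    Counts how often a word n-gram reappears, and whether the gap
    since its previous occurrence is short or long.
    """
    words = text.lower().split()
    total = max(len(words) - n + 1, 0)  # number of n-grams in the text
    if total == 0:
        return {"short": 0.0, "long": 0.0}

    last_seen = {}  # n-gram -> index of its most recent occurrence
    short = 0
    long_ = 0
    for i in range(total):
        gram = tuple(words[i:i + n])
        if gram in last_seen:
            gap = i - last_seen[gram]
            if gap <= short_range:
                short += 1
            else:
                long_ += 1
        last_seen[gram] = i

    # Fractions of n-grams that are repeats at each range.
    return {"short": short / total, "long": long_ / total}


profile = repetition_profile("the cat sat the cat ran")
print(profile)
```

A real classifier would presumably compare profiles like this one between known human-written and known machine-generated texts. The sketch only shows how such repetition could be counted.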

You can generate a random computer science paper of your own over [a href="http://pdos.csail.mit.edu/scigen/"]here[/a], and then see if you can slip it past the Inauthentic Paper Detector [a href="http://montana.informatics.indiana.edu/cgi-bin/fsi/fsi.cgi"]here[/a].

I had a bit of a Blade Runner moment just now, when it classified [a href="http://www.newscientisttech.com/article/dn9047-roboturtle-answers-some-flippery-questions.html"]this article[/a] I wrote yesterday as 'INAUTHENTIC' with just a 32.1% chance of being written by a human. I'm hoping it's down to the system being designed to work on technical articles...

The fake MIT paper was given a 21.5% probability of being authentic. Meanwhile, [a href="http://en.wikipedia.org/wiki/Hwang_Woo_Suk#Lifestyle"]Hwang Woo-Suk[/a]'s 2005 paper, in which he made [a href="http://www.newscientist.com/channel/sex/dn8557.html"]fraudulent[/a] claims to have cloned 11 lines of embryonic stem cells, comes up as 'AUTHENTIC', with only a 4.9% chance of being fake. I doubt that anyone will ever write a program to detect that kind of chicanery.
The real trouble with reality is that there's no background music.

Good day

It sounds like someone made a program that works in reverse: just submit your inauthentic documents to see if they have a high probability of fooling them.

TehBorken

Good day wrote:
It sounds like someone made a program that works in reverse: just submit your inauthentic documents to see if they have [div style="font-style: italic;"]a high probability of fooling them.[/div]
Hey f*ckwad, stop trying to sneak your shit-ass "[span style="font-style: italic; font-weight: bold;"]Hotbar Promo Code[/span]" crap in the pages. Asswipe.