SCIgen is a paper generator that uses a context-free grammar to randomly generate nonsense in the form of computer science research papers. Its original data source was a collection of computer science papers downloaded from CiteSeer. All elements of the papers are generated, including graphs, diagrams, and citations. Created by scientists at the Massachusetts Institute of Technology, its stated aim is "to maximize amusement, rather than coherence."[1] Originally created in 2005 to expose the lack of scrutiny of submissions to conferences, the generator was subsequently used, primarily by Chinese academics, to create large numbers of fraudulent conference submissions, leading to the retraction of 122 SCIgen-generated papers and the creation of detection software to combat its use.[2]
Repository | |
---|---|
Written in | Perl |
Available in | English |
Type | Paper generator |
License | GNU General Public License |
Website | http://pdos.csail.mit.edu/scigen/ |
Sample output
Opening abstract of Rooter: A Methodology for the Typical Unification of Access Points and Redundancy:[3]
Many physicists would agree that, had it not been for congestion control, the evaluation of web browsers might never have occurred. In fact, few hackers worldwide would disagree with the essential unification of voice-over-IP and public/private key pair. In order to solve this riddle, we confirm that SMPs can be made stochastic, cacheable, and interposable.
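The way a context-free grammar can produce sentences like the one above may be easier to see in a small sketch. The grammar below is a toy written for illustration only: its rules are made up, its vocabulary is borrowed loosely from the sample abstract, and the names `GRAMMAR`, `expand`, and `generate_sentence` are hypothetical. It is not SCIgen's actual rule set or code (SCIgen itself is written in Perl).

```python
import random

# Toy grammar: the upper-case keys are nonterminals, everything else is a terminal.
# Rules are invented for illustration; they are NOT SCIgen's real rules.
GRAMMAR = {
    "SENTENCE": [
        ["In order to", "GOAL", ", we", "VERB", "that", "NOUN", "can be made", "ADJ", "."],
        ["Many", "PEOPLE", "would agree that", "NOUN", "and", "NOUN", "are", "ADJ", "."],
    ],
    "GOAL": [["solve this riddle"], ["address this issue"]],
    "VERB": [["confirm"], ["argue"], ["demonstrate"]],
    "PEOPLE": [["physicists"], ["hackers"], ["theorists"]],
    "NOUN": [["congestion control"], ["web browsers"], ["access points"]],
    "ADJ": [["stochastic"], ["cacheable"], ["interposable"]],
}


def expand(symbol, rng):
    """Recursively rewrite a symbol until only terminals remain."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit unchanged
    production = rng.choice(GRAMMAR[symbol])  # pick one right-hand side at random
    return " ".join(expand(part, rng) for part in production)


def generate_sentence(rng=None):
    """Expand the start symbol, then remove the space left before punctuation."""
    if rng is None:
        rng = random.Random()
    text = expand("SENTENCE", rng)
    return text.replace(" ,", ",").replace(" .", ".")


if __name__ == "__main__":
    # A fixed seed makes the "random" sentence reproducible.
    print(generate_sentence(random.Random(2005)))
```

Each sentence is grammatical because the rules fix its shape, yet meaningless because the words filling each slot are chosen at random. This is consistent with the observation that individual sentences in such papers can look plausible in isolation.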
Prominent results
In 2005, a paper generated by SCIgen, Rooter: A Methodology for the Typical Unification of Access Points and Redundancy, was accepted as a non-reviewed paper to the 2005 World Multiconference on Systemics, Cybernetics and Informatics (WMSCI) and the authors were invited to speak. The authors of SCIgen described their hoax on their website, and it soon received great publicity when picked up by Slashdot. WMSCI withdrew their invitation, but the SCIgen team went anyway, renting space in the hotel separately from the conference and delivering a series of randomly generated talks on their own "track". The organizer of these WMSCI conferences is Professor Nagib Callaos. From 2000 until 2005, the WMSCI was also sponsored by the Institute of Electrical and Electronics Engineers.[4] The IEEE stopped granting sponsorship to Callaos from 2006 to 2008.
Submitting the paper was a deliberate attempt to embarrass WMSCI, which the authors claim accepts low-quality papers and sends unsolicited requests for submissions in bulk to academics. As the SCIgen website states:
One useful purpose for such a program is to auto-generate submissions to conferences that you suspect might have very low submission standards. A prime example, which you may recognize from spam in your inbox, is SCI/IIIS and its dozens of co-located conferences (check out the very broad conference description on the WMSCI 2005 website).
— About SCIgen[5]
Computing writer Stan Kelly-Bootle noted in ACM Queue that many sentences in the "Rooter" paper were individually plausible, which he regarded as posing a problem for automated detection of hoax articles. He suggested that even human readers might be taken in by the effective use of jargon ("The pun on root/router is par for MIT-graduate humor, and at least one occurrence of methodology is mandatory") and attribute the paper's apparent incoherence to their own limited knowledge. His conclusion was that "a reliable gibberish filter requires a careful holistic review by several peer domain experts".[6]
Schlangemann
The pseudonym "Herbert Schlangemann" was used to publish fake scientific articles in international conferences that claimed to practice peer review. The name is taken from the Swedish short film Der Schlangemann.
- In 2008, in response to a series of Call-for-Paper e-mails, SCIgen was used to generate a false scientific paper titled Towards the Simulation of E-Commerce, using "Herbert Schlangemann" as the author. The article was accepted at the 2008 International Conference on Computer Science and Software Engineering (CSSE 2008), co-sponsored by the IEEE, to be held in Wuhan, China, and the author was invited to be a session chair on grounds of his fictional curriculum vitae.[7] The official review comment: "This paper presents cooperative technology and classical Communication. In conclusion, the result shows that though the much-touted amphibious algorithm for the refinement of randomized algorithms is impossible, the well-known client-server algorithm for the analysis of voice-over-IP by Kumar and Raman runs in _(n) time. The authors can clearly identify important features of visualization of DHTs and analyze them insightfully. It is recommended that the authors should develop ideas more cogently, organizes them more logically, and connects them with clear transitions." The paper was available for a short time in the IEEE Xplore Database, but was then removed. The entire story is described in the official "Herbert Schlangemann" blog,[8] and it also received attention in Slashdot[9] and the German-language technology-news site Heise Online.[10][11]
- In 2009, a similar incident occurred: Herbert Schlangemann's latest fake paper, PlusPug: A Methodology for the Improvement of Local-Area Networks, was accepted for oral presentation at the 2009 International Conference on e-Business and Information System Security (EBISS 2009), also co-sponsored by IEEE, to be held again in Wuhan, China.[8]
In all cases, the published papers were withdrawn from the conferences' proceedings, and the conference organizing committee as well as the names of the keynote speakers were removed from their websites.
List of works with notable acceptance
In conferences
- Rob Thomas: Rooter: A Methodology for the Typical Unification of Access Points and Redundancy, 2005 for WMSCI (see above)
- Mathias Uslar's paper was accepted to the IPSI-BG conference.[12]
- Professor Genco Gulan published a paper in the 3rd International Symposium of Interactive Media Design.[13]
- A 2013 scientometrics paper demonstrated that at least 85 SCIgen papers have been published by IEEE and Springer.[14] Over 120 SCIgen papers were removed according to this research.[15]
In journals
- Students at Iran's Sharif University of Technology published a paper in Elsevier's Journal of Applied Mathematics and Computation.[16] The students wrote under the surname "MosallahNejad", which translates literally from Persian (in spite of not being a traditional Persian name) as "from an Armed Breed". The paper was subsequently removed when the publishers were informed that it was a joke paper.[17]
- Mikhail Gelfand published a translation of the "Rooter" article in the Russian-language Journal of Scientific Publications of Aspirants and Doctorants in August 2008. Gelfand was protesting against the journal, which was apparently not peer reviewed and was being used by Russian PhD candidates to publish in an "accredited" scientific journal, charging them 4000 rubles to do so. The accreditation was revoked two weeks later.[18][19][20][21] (See Dissernet for related information.)
- Springer Science+Business Media and IEEE were also the subject of similar pranks.
Spoofing Google Scholar and h-index calculators
Refereeing performed on behalf of the Institute of Electrical and Electronics Engineers has also been subject to criticism after fake papers were discovered in conference publications, most notably by Labbé and a researcher using the pseudonym of Schlangemann.[22][23][24][25][26][27]
In a 2010 paper, Cyril Labbé from Grenoble University demonstrated the vulnerability of h-index calculations based on Google Scholar output by feeding it a large set of SCIgen-generated documents that cited each other, effectively an academic link farm. Using this method, the author managed to rank "Ike Antkare" ahead of Albert Einstein, for instance.[28]
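To see why mutually citing documents inflate the metric, recall that an author's h-index is the largest h such that h of their papers each have at least h citations. The sketch below computes it for a hypothetical link farm in which every generated paper cites every other one. The function names and the farm size of 100 are assumptions chosen for illustration and are not figures from Labbé's experiment.

```python
def h_index(citation_counts):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


def link_farm_citations(n_papers):
    """Citation counts when each of n_papers cites all the others (hypothetical setup)."""
    return [n_papers - 1] * n_papers


if __name__ == "__main__":
    # An ordinary, made-up publication record.
    print(h_index([10, 8, 5, 4, 3]))            # 4
    # A made-up farm of 100 mutually citing generated papers.
    print(h_index(link_farm_citations(100)))    # 99
```

Because every paper in the farm is cited by all the others, the h-index grows almost linearly with the number of generated documents, with no human readers or genuine citations involved.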
2013 retractions
In 2013, over 122 published conference papers created by SCIgen were retracted by Springer and the IEEE. Unlike previous submissions that were intended to be pranks, these submissions were largely made by Chinese academics, who were using SCIgen papers to boost their publication record.[29]
SciDetect
In 2015, SciDetect was released by Springer. This software, developed by Cyril Labbé, is designed to automatically detect papers generated by SCIgen.[2]
2021 report
In 2021, a study was published on 243 SCIgen papers that had been published in the academic literature. The authors found that SCIgen papers made up 75 per million papers (<0.01%) in information science, and that only a small fraction of the detected papers had been dealt with.[30][31]
See also
- Academic conference
- Bogdanov Affair
- Derailment (thought disorder)
- Grievance studies affair
- Infinite monkey theorem
- List of scholarly publishing hoaxes
- Paper generator
- Parody generator
- Postmodernism Generator
- Sokal affair
- The Engine
- Turing test
- Get me off your fucking mailing list
- Who's Afraid of Peer Review?
References
1. SCIgen - An Automatic CS Paper Generator
2. Bohannon, John (2015-03-27). "Hoax-detecting software spots fake papers". Science | AAAS. Retrieved 2020-09-28.
   Rather than being created as pranks, it seems that many of the fake papers were coming from China where they were "bought by academics and students" to pad their publication records, says the lead researcher behind the investigation, Cyril Labbé, a computer scientist at Joseph Fourier University in Grenoble, France.
3. Stribling, Jeremy; Aguayo, Daniel; Krohn, Maxwell. "Rooter: A Methodology for the Typical Unification of Access Points and Redundancy" (PDF).
4. Heinrich Zankl: Der Science-Generator - ein geniales Publikationsprogramm. In W. Hömberg, E. Roloff (eds.): Jahrbuch der Marginalistik IV. Lit-Verlag, Münster, 2016, pp. 60–67. ISBN 978-3-643-99793-7
5. "SCIgen - An Automatic CS Paper Generator". MIT.
6. Stan Kelly-Bootle (July–August 2005). "Call that gibberish?". ACM Queue. 3 (6): 64. doi:10.1145/1080862.1080884.
7. "CSSE Conference Program" (PDF).
8. "The official Herbert Schlangemann Blog, The whole story behind the paper "Towards the Simulation of E-Commerce"".
9. kdawson (December 24, 2008). "Software-Generated Paper Accepted At IEEE Conference". Slashdot. VA Linux Systems Japan. Retrieved May 5, 2009.
10. Peter-Michael Ziegler (December 26, 2008). "Dr. Herbert Schlangemann - oder die Geschichte eines pseudowissenschaftlichen Nonsens-Papiers" (in German). Heise Online. Heise Zeitschriften Verlag. Retrieved May 5, 2009.
11. Heise Online webpage (in German)
12. "Mathias Uslar's paper". Archived from the original on 2009-06-15.
13. "About Genco Gulan's paper".
14. "Duplicate and Fake Publications in the Scientific Literature: How many SCIgen papers in Computer Science?" (PDF). Hal.archives-ouvertes.fr. Retrieved 2014-05-15.
15. "Publishers withdraw more than 120 gibberish papers". Nature. 24 February 2014. Retrieved 25 February 2014.
16. Rohollah Mosallahnezhad. "Cooperative, Compact Algorithms for Randomized Algorithms" (PDF). Archived from the original (PDF) on 2009-12-29.
17. Rohollah Mosallahnezhad (2007), "REMOVED: Cooperative, compact algorithms for randomized algorithms", Applied Mathematics and Computation, doi:10.1016/j.amc.2007.03.011
18. "Mon ordinateur écrit mieux que le tien!". Agence Science-Presse (in French). Canada. 8 September 2009. Retrieved 4 October 2011.
19. "Rooter invades Russia". SCIgen. 8 January 2009. Archived from the original on 2014-04-03. Retrieved 4 October 2011.
20. Malozemov, Sergei (7 October 2008). Группа отечественных ученых поставила эксперимент — смешала сложные термины случайным образом, а полученный текст отослала в один из научных журналов. NTV (in Russian). Retrieved 4 October 2011.
21. "Feedback". New Scientist. 15 August 2009.
22. Labbé, Cyril; Labbé, Dominique (2013). "Duplicate and fake publications in the scientific literature: how many SCIgen papers in computer science?". Scientometrics. 94 (1): 379–396. doi:10.1007/s11192-012-0781-y. S2CID 6889400.
23. Oransky, Ivan (February 24, 2014). "Springer, IEEE withdrawing more than 120 nonsense papers". retractionwatch. WordPress. Retrieved April 29, 2014.
24. de Gloucester, Paul Colin (2013). "Referees Often Miss Obvious Errors in Computer and Electronic Publications". Accountability in Research: Policies and Quality Assurance. 20 (3): 143–166. Bibcode:2013ARPQ...20..143D. doi:10.1080/08989621.2013.788379. PMID 23672521. S2CID 42975675.
25. Dawson, K. (December 23, 2008). "Software-Generated Paper Accepted At IEEE Conference". slashdot.org. Dice. Retrieved April 29, 2014.
26. Hatta, Masayuki (December 24, 2008). "IEEEカンファレンス、自動生成のニセ論文をアクセプト". slashdot.jp (in Japanese). OSDN Corporation. Retrieved April 29, 2014.
27. Ziegler, Peter-Michael (December 26, 2008). "Dr. Herbert Schlangemann - oder die Geschichte eines pseudowissenschaftlichen Nonsens-Papiers". heise.de (in German). Heise Zeitschriften Verlag. Retrieved April 29, 2014.
28. "Les rapports de recherche du LIG" (PDF). Rr.liglab.fr. Retrieved 2014-05-15.
29. Van Noorden, Richard (2014). "Publishers withdraw more than 120 gibberish papers". Nature News. doi:10.1038/nature.2014.14763.
30. Cabanac, Guillaume; Labbé, Cyril (2021-05-25). "Prevalence of nonsensical algorithmically generated papers in the scientific literature". Journal of the Association for Information Science and Technology. 72 (12): 1461–1476. doi:10.1002/asi.24495. ISSN 2330-1635. S2CID 236374033.
31. Noorden, Richard Van (2021-05-27). "Hundreds of gibberish papers still lurk in the scientific literature". Nature. 594 (7862): 160–161. Bibcode:2021Natur.594..160V. doi:10.1038/d41586-021-01436-7. PMID 34045760. S2CID 235232305.
Further reading
- Ball, Philip (2005). "Computer conference welcomes gobbledegook paper". Nature. 434 (7036): 946. Bibcode:2005Natur.434..946B. doi:10.1038/nature03653. PMID 15846311.
- kdawson (24 December 2008). "Software-Generated Paper Accepted At IEEE Conference". Slashdot. VA Linux Systems Japan. Retrieved 5 May 2009.
- Peter-Michael Ziegler (26 December 2008). "Dr. Herbert Schlangemann - oder die Geschichte eines pseudowissenschaftlichen Nonsens-Papiers" (in German). Heise Online. Heise Zeitschriften Verlag. Retrieved 5 May 2009.