Shortcut: WD:WikiProject Source
Wikidata:WikiCite
WikiCite WikiProject (formerly WikiProject Source Metadata)
The aim of WikiCite is:
- to act as a hub for work in Wikidata involving citation data and bibliographic data as part of the broader WikiCite initiative.
- to define a set of properties that can be used by citations, infoboxes, and Wikisource.
- to map and import all relevant metadata that currently is spread across Commons, Wikipedia, and Wikisource.
- to establish methods to interact with this metadata from different projects.
- to create a large open bibliographic database within Wikidata.
- to reveal, build, and maintain community stakeholdership for the inclusion and management of source metadata in Wikidata.
There have been various proposals over the years for similar projects (see meta:WikiCite for details). Now that Wikidata is here, we can make it happen.
Current activities
- User:Research Bot has imported metadata for millions of academic works. Some data (such as author names) is stored as strings.
- Go to Scholia, search for an author by name, go to that page, append "/missing" to the URL (e.g., https://scholia.toolforge.org/author/Q6371926/missing) and then start resolving author strings. There are "/missing" pages for topics as well, e.g. as documented here, and venues have them too (example).
- Go to https://author-disambiguator.toolforge.org, look for an author name and create an item + link existing work items. See Organizing the public data about a researcher.
- Use Wikidata lists to curate sets of items, e.g. Journals without main subject or Works with famous authors but no publication date or Famous people with unidentified co-authors or Semi-disambiguated authors.
- Add main subjects to all scientific articles. So9q has written a Python command-line tool that enables you to semi-automatically add main subjects.
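The "/missing" URL pattern described above can be sketched as a small helper. This is a minimal illustration, not part of Scholia itself: the function name `missing_url` is hypothetical, and the `topic` and `venue` path segments are assumed to mirror the author URL shown in the example.

```python
import re

SCHOLIA_BASE = "https://scholia.toolforge.org"

# Aspects this page mentions as having a "/missing" view. The path segments
# for topics and venues are an assumption based on the author URL pattern.
ASPECTS = {"author", "topic", "venue"}


def missing_url(aspect: str, qid: str) -> str:
    """Return the Scholia "/missing" URL for a Wikidata item ID."""
    if aspect not in ASPECTS:
        raise ValueError(f"unsupported aspect: {aspect!r}")
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return f"{SCHOLIA_BASE}/{aspect}/{qid}/missing"


# The author example from the list above.
print(missing_url("author", "Q6371926"))
```

Validating the item ID first means a mistyped name or label is rejected instead of producing a URL that silently leads nowhere.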
Past activities
- A WikiCite Roadmap of the future of bibliographic data in Wikidata has been discussed.
- so9q worked on a Python bot to follow the Wikimedia eventstream and list DOIs and ISBNs found in Wikipedia articles mentioned in the stream. The next step is to import them if the bot request is approved.
Ongoing imports
- Citations between works are being imported by Citation graph bot 2 (no edits since March 2021);
- Researchers and works from the University of Leeds (Q503424) are being imported by Sic19;
- Daniel Mietchen is adding main subject (P921) to publications (this is not really an import, since we do not have appropriate tooling, e.g. as discussed here);
- Mahdimoqri is importing journals from Crossref (Q5188229) (dataset import page);
- John Cummings is importing journals from Directory of Open Access Journals (Q1227538) (dataset import page);
- valhallasw is importing arXiv ID (P818) using data from Unpaywall (Q38352586) (dataset import page). Help is welcome!
- Sic19 is adding education and employment data from ORCID to existing items that lack this information.
- Kpjas is importing titles and labels (in Polish) of scientific articles that have been published in Polish scientific journals (currently: Wiadomosci lekarskie (Warsaw, Poland : 1960) (Q27713857) and Psychiatria Polska (Q11830678), more to follow). Problem: the ranks of titles need changing, which QuickStatements cannot do.
- Alessandro Marchetti (WMCH) with Researchers in Switzerland is slowly checking all authors. Mostly manual, with imported refinement.
- Alexmar983, Bargioni, Epìdosis (as part of the Gruppo Wikidata per Musei, Archivi e Biblioteche (GWMAB) group) are the main users active on the topic in Italy, see Wikidata:WikiCite/Italy
- The GWMAB is working, mostly manually, on the import of items of researchers affiliated to Italian universities (see project IRIS), which is the main driving force of the import strategy in Italy.
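Several of the imports above add main subject (P921) statements or rely on QuickStatements. As a rough sketch, the tab-separated QuickStatements V1 command format (item, property, value) can be generated as follows. The helper name `main_subject_commands` is hypothetical and the item IDs in the example are placeholders, not a real work and topic pair.

```python
import re

QID = re.compile(r"Q[1-9][0-9]*")


def main_subject_commands(pairs):
    """Return QuickStatements V1 lines that add main subject (P921) statements.

    `pairs` is an iterable of (work item ID, subject item ID) tuples.
    """
    lines = []
    for work, subject in pairs:
        if not (QID.fullmatch(work) and QID.fullmatch(subject)):
            raise ValueError(f"not a pair of item IDs: {work!r}, {subject!r}")
        # One command per line: item, property and value separated by tabs.
        lines.append(f"{work}\tP921\t{subject}")
    return "\n".join(lines)


# Placeholder IDs for illustration only; replace with real work and topic items.
print(main_subject_commands([("Q4115189", "Q13406268")]))
```

The output is meant to be reviewed by a person before it is pasted into QuickStatements, in keeping with the semi-automatic approach described on this page.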
Properties
See this subpage for more details.
Projects
- bib2wikidata is a tool being designed to import bibliographic data in Citation Style Language (Q824708) format into Wikidata.
- Open Access Signalling
- Citoid
- ORCID integration
- Proposed citation microformat for Wikipedias and other project content pages
- ProveIt - A powerful gadget that adds a visual interface to manage references, which is on its way to being connected to Wikidata
- Wikidata:Scholia, Scholia — A web service that displays scholarly profile pages based on Wikidata.
- Wikidata:SourceMD, SourceMD - Import "source MetaData" via DOI/PMID/PMCID (under repair)
- Fatameh - Import items to Wikidata about papers with content generated from places like EuropePMC, PubMed and Crossref.
- zotkat - Zotero export to QuickStatements format for importing in Wikidata
- Wikidata:WikiProject Source MetaData/Citation Typing Ontology - about CiTO annotations in Wikidata
Examples
Here is an example that creates a reference list with the articles
- Annotated checklist of the recent and extinct pythons (Serpentes, Pythonidae), with notes on nomenclature, taxonomy, and distribution (Q14405740)
- Tectonic map and overall architecture of the Alpine orogen (Q13416617)
- Uffenbach, Zacharias Konrad von (ADB) (Q20058533)
- Ecological guild evolution and the discovery of the world's smallest vertebrate (Q15567682)
based on the following code:
- {{#invoke:Cite | reflist | Q14405740 Q13416617 Q20058533 Q15567682 }}
This results in
Q21694395 → Wobst R. Cryptology Unlocked — Chichester: John Wiley & Sons Ltd, 2007. — 554 p. — ISBN 978-0-470-06064-3
Q21694386 → Luciano D., Prichett G. Cryptology: From Caesar Ciphers to Public-Key Cryptosystems // The College Mathematics Journal — MAA, Taylor & Francis, 1987. — Vol. 18, Iss. 1. — P. 2–17. — ISSN 0746-8342; 1931-1346 — doi:10.2307/2686311
Tasks
For a list of specific tasks and todos (missing data, missing properties, cleanup tasks) see /ToDo
Workflow for profiling researchers
How to create a scholarly profile for a researcher in Wikidata
- Consider the platform
- Visit Wikidata
- Wikidata is the database which anyone can edit
- The Wikidata community curates this data
- Consider WikiCite
- WikiCite is the community project within Wikidata which curates source metadata
- The WikiCite community is a subset of the Wikidata community
- Consider how anyone accesses data
- Scholia is the specialized Wikidata tool for viewing academic profiles of people, topics, universities, etc
- If a profile looks good in Scholia, then the data is correctly formatted to be maximally open and accessible in Wikidata and the Semantic Web
- Making a profile look good in Scholia is the quickest and easiest way to format data once and for all
- the Wikidata Query Service is the general Wikidata tool for viewing groups of Wikidata content
- Everyone else, including big tech, big publishing, big government, etc., scrapes Wikidata and reuses this content, so what is in Wikidata goes everywhere else
- Identify or create the Wikidata item for the researcher to profile
- use basic Wikidata search by the person's name
- if the item for the person exists, then use it
- if the item does not exist, then create it
- follow the instructions for creating a profile for a human in Wikidata:WikiProject Biographies
- add enough information to uniquely identify this person by name and a few other characteristics
- If there is ambiguity because multiple people have the same name and characteristics, then create a new item. Items can be merged, and merging duplicates is easier to fix than separating mixed items.
- Try to add the ORCID, which is a unique scientific identifier
- visit https://orcid.org/
- search for the researcher
- if there is an easy and obvious match, then grab the ORCID
- go back to Wikidata
- click "add statement", enter ORCID, paste the ORCID, publish
- run ORCIDator, a Wikidata tool to import ORCID data into Wikidata
- Access through the SourceMD tool - https://www.wikidata.org/wiki/Wikidata:SourceMD
- further documentation at https://www.wikidata.org/wiki/Wikidata:ORCIDator
- there is often no ORCID, or the ORCID is blank, or there is ambiguity - pass if this is the case
- Use the "Wikidata Author Disambiguation tool"
- This will match papers indexed in Wikidata to the target researcher
- https://tools.wmflabs.org/author-disambiguator/
- Enter the target researcher's name
- as of 2019, the tool is clunky
- try name variations, including initials, or whatever is likely in an academic paper
- Identify name variations
- go back to the Wikidata item for the person
- add the variations to the "also known as" field at the top of the item
- noting the variations greatly assists ongoing maintenance and profile updates
- Wait
- Wikidata is a nonprofit project of the Wikimedia Community
- technical infrastructure is modest; in 2019 updates typically take 5-30 minutes
- Like Wikipedia, Wikidata depends on volunteer contributors of content and donor funding
- thanks for editing, it is the most valuable contribution anyone can make
- View incomplete profile on Scholia
- enter the person's name - it should autocomplete
- profile generated based on available data
- use Scholia's "missing content" tool
- this is not obvious: access it by adding "/missing" at the end of the Scholia URL
- the missing tool is actually a collection of tools which search and suggest possible data to add to the profile
- building out the network of collaborators is easy from here
- consider building profiles for top co-authors
- consider building profiles for people who commonly cite target researcher's papers
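Before pasting an ORCID into the "add statement" field as described in the workflow above, it can help to check that the identifier was copied correctly. The sketch below applies the ISO 7064 MOD 11-2 check digit that ORCID iDs carry as their final character. The function names are hypothetical, and the check only catches typos: it cannot confirm that the iD belongs to the intended researcher.

```python
import re

# Four hyphen-separated groups of four; the last character may be "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string has the ORCID shape and a matching check digit."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))  # True
print(is_valid_orcid("0000-0002-1825-0098"))  # False: last digit altered
```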
Possible Data Collaborators
Some possible Data Collaborators have expressed interest in working on source metadata in Wikidata; others might usefully be approached.
OCLC, which runs WorldCat, is very keen on collaborating with Wikidata; User:Maximiliankleinoclc wrote a letter about the possibilities.
ContentMine has some excellent open software tools, which we could use to let Wikidata answer queries like "List all the review papers ever written on malaria vaccines", "List all the articles that mention Lygodactylus williamsii", "List every paper ever written by John Tuzo Wilson" and "List all the papers cited in Wikipedia articles that have been retracted". They listed "An Open Bibliography of science, updated daily" as a "wikiwish" at Wikimania 2014, apparently unaware that this project had been started at a slightly earlier workshop.
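As an illustration of what a "list every paper by an author" query might look like against Wikidata today, the sketch below builds a SPARQL query for the Wikidata Query Service using the author (P50) property. The helper name `works_by_author_query` is hypothetical, and the item ID in the example is the author from the Scholia example earlier on this page, not John Tuzo Wilson. The query only finds works linked through author (P50); works that carry only an author name string (P2093) are not matched.

```python
import re


def works_by_author_query(author_qid: str, limit: int = 100) -> str:
    """Build a SPARQL query listing works whose author (P50) is the given item."""
    if not re.fullmatch(r"Q[1-9][0-9]*", author_qid):
        raise ValueError(f"not a Wikidata item ID: {author_qid!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    # The wd:, wdt:, wikibase: and bd: prefixes are predefined by the
    # Wikidata Query Service, so no PREFIX declarations are needed there.
    return (
        "SELECT ?work ?workLabel WHERE {\n"
        f"  ?work wdt:P50 wd:{author_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


print(works_by_author_query("Q6371926", limit=10))
```

The resulting text can be pasted into the Wikidata Query Service editor.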
PLOS has an API for RichCitations, which contains metadata on all PLOS papers up through late 2014. Rich Citations is a novel structured format to express each citation as a data element, and it includes a set of useful, additional terms specific to scholarly literature that enable research about the knowledge web citations create. It also includes a display feature much like Reference Tooltips, but linked to a database (which is open licensed), so it can update metainformation. They presented at Wikimania 2014 and are keen to collaborate and share their results with us.
Zotero is interested in the idea of a proofread metadata source. Some Zotero users currently upload to cloud storage; we might build tools to let them upload here, instead. CiteseerX has a large open-licensed database of article metadata, and might want to set up an exchange, but have not responded to e-mails.
The Cochrane Collaboration is developing an API to its metadata (they were contacted about this project in July 2014, so this use case may have helped shape the API). They produce large amounts of non-conventional metadata on works they review, and on works they produce, both of which Wikimedians quote.
Institutional repositories are also increasingly interested in open APIs and linked databases, and seem generally receptive to this project. The university-run academic search engine BASE aggregates and normalizes these repositories and makes its data collection available for non-commercial purposes.
The French ministry of research and teaching developed an open science barometer by harvesting various sources of bibliometric data.
- The method has been published here: https://direct.mit.edu/qss/article/3/1/18/109245/Identifying-scientific-publications-countrywide
- The dataset is under a CC licence but not CC0, so it is probably not compatible.
- The code (including data harvesting) is on GitHub: https://github.com/orgs/dataesr
Resources
Statistics
Of the 41,458,756 items which are instance of (P31) scholarly article (Q13442814):
Property | With property | Without property | Coverage |
---|---|---|---|
DOI (P356) | 29,596,004 | 11,862,752 | 71.4% |
PubMed publication ID (P698) | 32,036,547 | 9,422,209 | 77.3% |
main subject (P921) | 17,457,870 | 24,000,886 | 42.1% |
language of work or name (P407) | 16,793,065 | 24,665,691 | 40.5% |
author (P50) | 11,606,828 | 29,851,928 | 28.0% |
author name string (P2093) | 38,760,042 | 2,698,714 | 93.5% |
Only author (P50), no author name string (P2093) | 1,267,160 | 40,191,596 | 3.1% |
Only author name string (P2093), no author (P50) | 28,420,374 | 13,038,382 | 68.6% |
Updated: 11 March 2024.
- Please see More statistics
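The coverage column in the table above is the share of scholarly article items that carry the property. A minimal sketch of that arithmetic, using three rows copied from the table (the function and variable names are illustrative), also shows how the number of items carrying both author (P50) and author name string (P2093) can be derived from the published counts:

```python
# Items that are instance of (P31) scholarly article (Q13442814), per the table.
TOTAL = 41_458_756

# "With property" counts copied from the table above.
WITH_PROPERTY = {
    "DOI (P356)": 29_596_004,
    "author (P50)": 11_606_828,
    "author name string (P2093)": 38_760_042,
}
# Items with author (P50) but no author name string (P2093), from the table.
ONLY_AUTHOR = 1_267_160


def coverage(with_count: int, total: int = TOTAL) -> float:
    """Percentage of items that carry the property, rounded to one decimal."""
    return round(100 * with_count / total, 1)


def without(with_count: int, total: int = TOTAL) -> int:
    """Number of items that lack the property."""
    return total - with_count


# Items that have both an author item and an author name string.
both = WITH_PROPERTY["author (P50)"] - ONLY_AUTHOR

for name, count in WITH_PROPERTY.items():
    print(f"{name}: {coverage(count)}% covered, {without(count):,} without")
print(f"both author and author name string: {both:,}")
```

The derived figure of 10,339,668 items with both properties suggests how many works are only partly resolved from author name strings to author items.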
Subpages
The following subpages belong to the project:
- WikiCite/Academic Journals
- WikiCite/ArticlePlaceholder
- WikiCite/Bibliographic metadata for scholarly articles in Wikidata
- WikiCite/Citation Typing Ontology
- WikiCite/Colombia
- WikiCite/Conferences
- WikiCite/Country-level initiatives
- WikiCite/Dagbani
- WikiCite/Indexes
- WikiCite/Italy
- WikiCite/Journals by language
- WikiCite/Journals by publisher
- WikiCite/Literature surveys
- WikiCite/Manual import
- WikiCite/More/Participants
- WikiCite/Participants
- WikiCite/Refinements
- WikiCite/Refinements/Italy
- WikiCite/Refinements/Switzerland
- WikiCite/Research Journals in Switzerland
- WikiCite/Research in Switzerland
- WikiCite/Research repositories in Switzerland
- WikiCite/Research repositories in Switzerland/Graduate Institute
- WikiCite/Researchers in Switzerland
- WikiCite/Researchers in Switzerland/Contacts
- WikiCite/Researchers in Switzerland/Imports and reconciliations by Identifiers
- WikiCite/Researchers in Switzerland/Imports and reconciliations by webpages
- WikiCite/Researchers in Switzerland/Queries
- WikiCite/Researchers in Switzerland/Statistics
- WikiCite/Researchers in Switzerland/homonyms
- WikiCite/Roadmap
- WikiCite/SWAT4LS 2017 tutorial
- WikiCite/Source types
- WikiCite/Standards
- WikiCite/Statistics
- WikiCite/Statistics/Summary
- WikiCite/Switzerland
- WikiCite/Theses and dissertations
- WikiCite/Theses by institution
- WikiCite/ToDo
- WikiCite/Tool audit
- WikiCite/Wikidata lists
- WikiCite/Wikidata lists/Author name strings matched to author items using Stated As
- WikiCite/Wikidata lists/Author name strings that are on multiple papers with at least three identical co-authors
- WikiCite/Wikidata lists/Author name strings that are on multiple papers with at least three identical co-authors (by author name string)
- WikiCite/Wikidata lists/Author name strings that are on multiple papers with at least three identical co-authors (by co-authors)
- WikiCite/Wikidata lists/Authors missing P108 (employer) or P69 (educated at) or P184 (doctoral advisor)
- WikiCite/Wikidata lists/Authors of multiple works that do not have a main subject statement
- WikiCite/Wikidata lists/Authors with ORCID
- WikiCite/Wikidata lists/Corporate authors
- WikiCite/Wikidata lists/Famous people with unidentified co-authors
- WikiCite/Wikidata lists/Items about HTLV-1
- WikiCite/Wikidata lists/Items about RNA-Seq
- WikiCite/Wikidata lists/Items about Zika virus or fever
- WikiCite/Wikidata lists/Items about earthquake
- WikiCite/Wikidata lists/Items about heterosis
- WikiCite/Wikidata lists/Items citing other items
- WikiCite/Wikidata lists/Items missing JSTOR ID
- WikiCite/Wikidata lists/Items missing JSTOR ID/raw
- WikiCite/Wikidata lists/Items missing JSTOR ID/row
- WikiCite/Wikidata lists/Items with arXiv IDs
- WikiCite/Wikidata lists/Journals without main subject
- WikiCite/Wikidata lists/Long author name strings used on multiple works
- WikiCite/Wikidata lists/Main subjects of publications with unidentified authors
- WikiCite/Wikidata lists/Periodicals with no main subject statement
- WikiCite/Wikidata lists/Popular author name strings on works about taxa
- WikiCite/Wikidata lists/Popular author name strings on works about topics that have a Disease Ontology ID
- WikiCite/Wikidata lists/Popular author name strings on works about topics that have a geolocation
- WikiCite/Wikidata lists/Popular strings in titles of works with a given author name string
- WikiCite/Wikidata lists/Publications where all authors have an ORCID
- WikiCite/Wikidata lists/Published today
- WikiCite/Wikidata lists/Scholarly journals
- WikiCite/Wikidata lists/Semi-disambiguated authors
- WikiCite/Wikidata lists/Unidentified authors on publications about the Zika virus
- WikiCite/Wikidata lists/Usage of Scholia in Template Medical resources on the English Wikipedia
- WikiCite/Wikidata lists/Usage of Template Scholia
- WikiCite/Wikidata lists/Usage of Template Scholia/Basque Wikipedia
- WikiCite/Wikidata lists/Usage of Template Scholia/Cross-wiki
- WikiCite/Wikidata lists/Usage of Template Scholia/Cross-wiki/List
- WikiCite/Wikidata lists/Usage of Template Scholia/Cross-wiki/Query
- WikiCite/Wikidata lists/Usage of Template Scholia/English Wikipedia
- WikiCite/Wikidata lists/Usage of Template Scholia/English Wikipedia/Query
- WikiCite/Wikidata lists/Usage of Template Scholia/English Wikipedia/With CC0 images
- WikiCite/Wikidata lists/Usage of Template Scholia/English Wikisource
- WikiCite/Wikidata lists/Usage of Template Scholia/Macedonian Wikipedia
- WikiCite/Wikidata lists/Usage of Template Scholia/Malayalam Wikipedia
- WikiCite/Wikidata lists/Usage of Template Scholia/Swedish Wikipedia
- WikiCite/Wikidata lists/Works co-authored by famous people and unidentified authors
- WikiCite/Wikidata lists/Works that are referenced and cited but do not have a main subject
- WikiCite/Wikidata lists/Works with famous authors but no main subject statement
- WikiCite/Wikidata lists/Works with famous authors but no publication date
- WikiCite/Wikidata lists/Works with missing statements for both main subject and publication date
- WikiCite/ar
- WikiCite/da
- WikiCite/de
- WikiCite/en
- WikiCite/es
- WikiCite/fr
- WikiCite/id
- WikiCite/it
- WikiCite/ja
- WikiCite/ms
- WikiCite/nl
- WikiCite/nonexistent PMIDs
- WikiCite/nonexistent PMIDs/1-10000000
- WikiCite/nonexistent PMIDs/10000000-14000000
- WikiCite/nonexistent PMIDs/14000000-17000000
- WikiCite/nonexistent PMIDs/17000000-20000000
- WikiCite/nonexistent PMIDs/20000000-22000000
- WikiCite/nonexistent PMIDs/22000000-24000000
- WikiCite/nonexistent PMIDs/24000000-27000000
- WikiCite/nonexistent PMIDs/27000000-30000000
- WikiCite/nonexistent PMIDs/30000000-
- WikiCite/pl
- WikiCite/ru
- WikiCite/tr
- WikiCite/uk
- WikiCite/ur
- WikiCite/zh
Contact
- IRC channel: #wikidata-wpsm (webchat now!) and #wikicite (webchat now!)
- Mailing list: wikicite-discuss
- Twitter: search for wikicite #WikiCite or @WikiCite
Participants
The first list has now reached the maximum number possible for {{Ping project}}. Please therefore add your name to the second list below.
To ping both lists, use {{Ping project|Source MetaData}} and {{Ping project|Source MetaData/More}} in two different posts.
{{Ping project|WikiCite}}
- Mattsenate (talk) 13:11, 8 August 2014 (UTC)
- KHammerstein (WMF) (talk) 13:15, 8 August 2014 (UTC)
- Mitar (talk) 13:17, 8 August 2014 (UTC)
- Mvolz (talk) 18:07, 8 August 2014 (UTC)
- Daniel Mietchen (talk) 18:09, 8 August 2014 (UTC)
- Merrilee (talk) 13:37, 9 August 2014 (UTC)
- Pharos (talk) 14:09, 9 August 2014 (UTC)
- DarTar (talk) 15:46, 9 August 2014 (UTC)
- HLHJ (talk) 09:11, 11 August 2014 (UTC)
- Blue Rasberry 18:02, 11 August 2014 (UTC)
- JakobVoss (talk) 12:23, 20 August 2014 (UTC)
- Finn Årup Nielsen (fnielsen) (talk) 02:06, 23 August 2014 (UTC)
- Jodi.a.schneider (talk) 09:24, 25 August 2014 (UTC)
- Abecker (talk) 23:35, 5 September 2014 (UTC)
- Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:21, 24 October 2014 (UTC)
- Mike Linksvayer (talk) 23:26, 18 October 2014 (UTC)
- Kopiersperre (talk) 20:33, 20 October 2014 (UTC)
- Jonathan Dugan (talk) 21:03, 20 October 2014 (UTC)
- Hfordsa (talk) 19:26, 5 November 2014 (UTC)
- Vladimir Alexiev (talk) 15:09, 23 January 2015 (UTC)
- Runner1928 (talk) 03:25, 6 May 2015 (UTC)
- Pete F (talk)
- econterms (talk) 13:51, 19 August 2015 (UTC)
- Sj (talk)
- TomT0m
- addshore 17:43, 18 January 2016 (UTC)
- Bodhisattwa (talk) 16:08, 29 January 2016 (UTC)
- Ainali (talk) 16:51, 29 January 2016 (UTC)
- Shani Evenstein (talk) 21:29, 5 July 2018 (UTC)
- Skim (talk) 07:17, 6 November 2018 (UTC)
- PKM (talk) 23:19, 19 November 2018 (UTC)
- Ocaasi (talk) 22:19, 29 November 2018 (UTC)
- Trilotat Trilotat (talk) 15:43, 16 February 2019 (UTC)
- Iwan.Aucamp
- Alessandra Boccone
- Pablo Busatto (talk) 05:40, 23 June 2020 (UTC)
- Blrtg1 (talk) 17:20, 23 July 2020 (UTC)
- Kosboot (talk) 21:32, 23 July 2020 (UTC)
- Matlin (talk) 09:38, 11 August 2020 (UTC)
- Carrierudd (talk) 11:44, 3 November 2020 (UTC)
- So9q (talk) 11:35, 16 January 2021 (UTC)
- pdesai (talk) 16:00, 8 February 2021 (UTC)
- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 18:43, 17 May 2021 (UTC)
- Joeyvandernaald (talk) 19:13, 8 October 2023 (UTC)
- Maxime
- CorraleH (talk)
Historical discussions
There have been historical discussions about Wikidata hosting information about the sources of data.
- Wikidata:Requests_for_comment/Source_items_and_supporting_Wikipedia_sources - Confirmed that sources could be typical items in Wikidata, and that a separate sort of item only for sources ("S" instead of "Q") need not be created as a space only for sources
- Wikidata:Requests for comment/Sourcing requirements for bots - Confirmed that bots set standards for best practices, and that claims made through bots are best backed with sources.
- Wikidata:Requests_for_comment/References_and_sources - Confirmed Help:Sources as an official project guideline
- m:Wikidata/Notes/Bibliographic data
- Wikipedia as the front matter to all research - Discussed role of Wikidata in handling citations across Wikimedia projects