Ahh. What are pmcs?
On Wed, Oct 22, 2014 at 5:06 PM, Maximilian Klein <isalix(a)gmail.com> wrote:
Out of interest, my regex was
pmc\s*\=\s*(.*?)[\|\}]
and then also
pmid\s*\=\s*(.*?)[\|\}]
with ignorecase flag set on.
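
For anyone wanting to try these two patterns, here is a minimal Python sketch. It assumes the patterns are applied to raw wikitext with Python's `re` module; the `extract_ids` helper and the sample citation are made up for illustration and are not taken from the listiness repository.

```python
import re

# The two patterns quoted above, compiled with the ignorecase flag.
PMC_RE = re.compile(r"pmc\s*\=\s*(.*?)[\|\}]", re.IGNORECASE)
PMID_RE = re.compile(r"pmid\s*\=\s*(.*?)[\|\}]", re.IGNORECASE)


def extract_ids(wikitext):
    """Return (pmcs, pmids) captured from citation template parameters.

    Hypothetical helper for illustration. The lazy group captures everything
    up to the next "|" or "}", so surrounding whitespace is stripped here.
    """
    pmcs = [value.strip() for value in PMC_RE.findall(wikitext)]
    pmids = [value.strip() for value in PMID_RE.findall(wikitext)]
    return pmcs, pmids


# Hypothetical sample citation, not from a real article.
sample = "{{cite journal |title=Example |pmid = 12345 |PMC=67890}}"
pmcs, pmids = extract_ids(sample)
```

Because the capture group is `(.*?)` rather than digits only, it can also pick up non-numeric values, so the resulting lists may need filtering before analysis.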
Make a great day,
Max Klein ‽
http://notconfusing.com/
On Wed, Oct 22, 2014 at 12:48 PM, Aaron Halfaker <aaron.halfaker(a)gmail.com> wrote:
> Hey folks,
>
> Somehow I missed this thread, but I've already addressed this request on
> the Village Pump[1]. See:
>
http://datasets.wikimedia.org/public-datasets/enwiki/etc/pmids.articles.201…
>
>
> I extracted PMIDs with the following regex: /\bpmid *= *[0-9]+\b/i
>
> It includes page_id, page_namespace, page_title, rev_id (most recent),
> pmid in TAB separated values.
>
> Let me know if you have questions or if you think the regex matching
> strategy is insufficient. It's pretty quick to take another pass.
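
A rough Python sketch of the matching step described above, assuming the `/.../i` regex is translated to Python's `re` module. The `pmid_rows` function and the sample values are hypothetical; only the regex and the column order (page_id, page_namespace, page_title, rev_id, pmid) come from the message.

```python
import re

# The regex quoted above, translated from /.../i notation to Python.
PMID_RE = re.compile(r"\bpmid *= *[0-9]+\b", re.IGNORECASE)


def pmid_rows(page_id, page_namespace, page_title, rev_id, text):
    """Yield one TAB-separated line per PMID match found in `text`."""
    for match in PMID_RE.finditer(text):
        # The quoted regex has no capture group, so take the trailing digits.
        pmid = re.search(r"[0-9]+$", match.group(0)).group(0)
        yield "\t".join(
            [str(page_id), str(page_namespace), page_title, str(rev_id), pmid]
        )


# Hypothetical page metadata and wikitext, for illustration only.
rows = list(
    pmid_rows(1, 0, "Example", 100, "{{cite journal |pmid = 12345 |PMID=678}}")
)
```

Unlike the `\s*` patterns mentioned elsewhere in the thread, this regex allows only literal spaces around the `=`, so a parameter split across lines would not match.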
>
> 1.
>
https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Extracting…
>
> On Wed, Oct 22, 2014 at 1:27 PM, Maximilian Klein <isalix(a)gmail.com>
> wrote:
>
>> Jake,
>> I have a script that already does this for DOIs; it was a one-line change
>> to make. These files should answer what you were looking for.
>>
>>
https://raw.githubusercontent.com/notconfusing/listiness/pmc/pmc_list.txt
>>
>>
https://raw.githubusercontent.com/notconfusing/listiness/pmc/pmid_list.txt
>>
>> In the future you can tell them to use halfak's
>>
https://pythonhosted.org/mediawiki-utilities/
>> This is the code I used to get those lists.
>>
https://github.com/notconfusing/listiness/commit/e140ce9202b9c1098dec40ca1d…
>>
>> Make a great day,
>> Max Klein ‽
http://notconfusing.com/
>>
>> On Mon, Oct 20, 2014 at 9:20 PM, Andrew G. West <west.andrew.g(a)gmail.com> wrote:
>>
>>> Jake,
>>>
>>> Yes, it's a rather straightforward parse based on the citation format
>>> which Jeremy described. Doc James and I already have this coded up for a
>>> soon-to-be-published [[WP:MED]] readership/editorship paper.
>>>
>>> Searching for PMIDs in the entirety of the Wikipedia article base
>>> would be a bit time-consuming, but if one needs to pull down only
>>> articles in WikiProject Medicine, for example, I am also able to help on
>>> that front.
>>>
>>> Perhaps we'll take this offline, but if anyone else is interested in
>>> the dirty details, feel free to contact one of us off-list. -AW
>>>
>>> --
>>> Andrew G. West, PhD
>>>
http://www.andrew-g-west.com
>>>
>>>
>>>
>>> On 10/20/2014 11:57 PM, Jake Orlowitz wrote:
>>>
>>>> Hi folks,
>>>>
>>>> Relaying a question from a Stanford medical researcher:
>>>>
>>>> "Do you know if it is possible to extract PubMed ID (PMID) or PMCIDs
>>>> from Wiki references? Furthermore, could you dump those IDs out into a
>>>> list for analysis?"
>>>>
>>>> Best,
>>>> Jake Orlowitz (Ocaasi)
>>>>
>>>>
>>>> _______________________________________________
>>>> Wiki-research-l mailing list
>>>> Wiki-research-l(a)lists.wikimedia.org
>>>>
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l