Jorge Herskovic

The OMICS group strikes again

June 24, 2014 by jorge

The OMICS group is still trolling for scientists to join their high-quality events. The latest is an invitation to Cosmetology 2014, a “Dermatology” conference where all of the speakers are cosmetics and hair salon CEOs. I’m sure I could come speak to them. After all, they probably want to hear me talk about academic metrics, clinical data warehousing, or the applications of graph theory to natural language processing in clinical text.

On second thought, they probably just want me as a demonstration subject for their hair growth workshop. They might actually get me to attend if they offered that…

spammy_conference

Filed Under: Commentary, Research

New spammy tactic by predatory publishers

April 14, 2014 by jorge

The predatory publishers on Beall’s List keep looking for scientists to spam. The newest tactic is a beauty – just connect to people on LinkedIn.

I got the request this morning. I took a screenshot from my phone.

linkedin_spam_predatory_publisher

 

Yes, “OMICS Publishing Group” is featured on Beall’s List. They send me a lot of spam. As you can see by the non-empty “InCommon” field, several of my dear colleagues already fell for this.

Always check Beall’s List before interacting with scientific publishers. In today’s world, it’s sad but necessary.

Filed Under: Commentary, Research

The ZTE Open with Firefox OS

March 4, 2014 by jorge

I bought a phone!

ZTE_Open_box

 

Granted, it’s not the latest and greatest. I even bought it used; it was less than $70 on a certain jungle-themed shopping website.

I’ll be doing some international travel soon, and it’s useful to have an unlocked, multi-band GSM smartphone. This fits the bill, and it gives me an opportunity to play with a different OS: Firefox OS.

What is Firefox OS? It’s Firefox running on top of the Linux kernel (actually on top of parts of the Android Open Source Project), without a lot of the baggage of Android. A lot of the tooling keeps the Android heritage; for example, you update the OS using Android’s SDK tools.

First things first: please don’t go out and buy one of these unless you know what you’re doing. They are not consumer devices.

The hardware resembles an iPhone 3GS. 3G-enabled, 256 MB of RAM, single-core ARM processor, 3 MP camera, 320×480 pixels. Full specs here. The screen is, by now, old-school. It’s way, way behind the glass. You can see a gap of at least 1 mm. Compared to a Galaxy S4 or an iPhone 5, it looks vaguely shameful.

The screen is also cheap. Viewing angle = 13.7 degrees to each side, or that’s what it feels like. I didn’t measure it.

The Gap

These would, of course, be terrible things in a flagship phone that retailed for $700. On a phone that costs literally 10% of that, meant for developers and enthusiasts who want to play with a new OS, it’s quite enough. The phone has a bog-standard mini-SIM slot, and a microSD card slot to supplement the meager 512 MB of storage it comes with.

I removed my Printrbot’s microSD card and repurposed it. The phone accepted it without complaints.

Like with any other phone, the first thing I did was check for updates. The phone didn’t report anything, so I assumed it had the latest OS image. About an hour later, I decided to Google “Firefox OS update.” That’s when I discovered that Firefox OS 1.0, the version that came on my phone, wasn’t actually capable of updating itself, so it didn’t report on updates.

See what I mean about “please don’t go out and buy one?”

Updating it was fairly straightforward for anyone who’s rooted an Android phone. In this case, I had to put the new OS image on the microSD card, boot to a special recovery mode, and install the image. Which erases all of the data on the phone. So please don’t go out and buy one yet. An iPhone, this is not.

It was uneventful, except that it didn’t work the first few times I tried. It took me a while to realize that I had mistakenly downloaded the update from v1.1 to v1.1, instead of the one from v1.0 to v1.1. Which I think is a legitimate mistake, because it didn’t even cross my mind that there would be an update from 1.1 to 1.1. As far as typical version numbering goes, it doesn’t make a lot of sense.

Once 1.1 is actually up and running, you can use “the easy method” to install 1.2. With the Android SDK installed on your computer, the easy method involves downloading the upgrade from ZTE’s Dropbox account, rebooting the phone into its bootloader, and copying the updated Firefox OS piece by piece to the proper area on the phone’s filesystem. It reads something like this:

$ adb reboot bootloader
$ fastboot flash boot boot.img
$ fastboot flash userdata userdata.img
$ fastboot flash system system.img
$ fastboot flash recovery recovery.img
$ fastboot erase cache
$ fastboot reboot

It’s not rocket surgery, but it’s not a task that you want to dump on your typical end user.
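For anyone who would rather script those steps, here is a minimal sketch in Python. It only rebuilds the same command sequence shown above; the function names and the dry-run default are my own assumptions, and it presumes the four .img files sit in the current directory. By default it prints the commands and runs nothing.

```python
import subprocess

# Partitions flashed in the listing above, in the same order.
PARTITIONS = ["boot", "userdata", "system", "recovery"]


def flash_commands(partitions=PARTITIONS):
    """Build the adb/fastboot command sequence as a list of argument lists."""
    commands = [["adb", "reboot", "bootloader"]]
    for name in partitions:
        # Assumes each image is named <partition>.img, as in the listing.
        commands.append(["fastboot", "flash", name, name + ".img"])
    commands.append(["fastboot", "erase", "cache"])
    commands.append(["fastboot", "reboot"])
    return commands


def run(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for command in commands:
        print("$ " + " ".join(command))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run(flash_commands())  # dry run: prints the seven commands
```

Stopping at the first failure matters here, because flashing the remaining partitions after one has failed is a good way to end up with a phone that won’t boot.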

Once the update is applied, the OS actually looks quite nice and runs smoothly, even on this decidedly low-end hardware. The browser of choice is, unsurprisingly, Firefox. It looks a tremendous amount like Firefox for Android.

The OS itself is clearly a work in progress, and somewhat rough around the edges. For one thing, this “official” 1.2 distribution comes loaded with test apps like “Membuster” (no, it isn’t a game) and “Template.”

2014-02-26-19-14-32

(also note that, in a very nice touch, the hardware supports an FM radio – something the iPhone still doesn’t have)

This wouldn’t be a big deal if you could delete the test apps. Just like on iOS, tapping and holding lets you delete apps by putting a red “x” in the corner of the icon. Unfortunately there’s a rendering glitch in the delete confirmation screen…

2014-02-26-19-39-38

Yup, the dock renders in front of the confirmation buttons, and doesn’t let you press them. Oh well, who wouldn’t want Test Container on their phone? A reboot seemed to cure this problem, though, and I was able to remove all the test apps just fine.

Some other rough edges are evident in the App Marketplace (i.e. the Firefox OS equivalent of an App Store). For example, apps are supposed to specify their storage requirements, but many don’t.

2014-02-26-19-18-26

I also haven’t been able to open the notifications drawer. On 1.0, sliding your finger down from the top of the screen did the trick. On 1.2, it doesn’t.

The default keyboard has all of the disadvantages of iOS’ keyboard, and none of the niceties. Swype this is not.

Searching

 

On this screenshot, you can see an interesting feature of the OS: systemwide search that goes beyond the phone itself. Without a Gmail app, the phone was able to “find” Gmail, style it like an app, and present it as a result in the main search pane. When you open it, it launches the mobile web version of Gmail in a full-screen browser automatically. Bam! Instant app.

Of course, every app on Firefox OS is just HTML5 and JavaScript, so the distinction is mainly between “already installed on the phone and tweaked specifically for Firefox OS” and “retrieved from a server, and may or may not be custom-tailored.” In practice, the difference isn’t much.

All in all, it’s been fun to tinker with a new, obscure phone OS that runs quite well on obsolete hardware.

 

Not all obsolete hardware is the same. It turns out that there are two kinds of ZTE Open. One is sold directly by ZTE through eBay. The others are grey-market ones imported from elsewhere, with locked bootloaders, and incapable of running anything but ZTE firmware. Which lags behind the official Firefox OS quite a bit.

Ask me how I know.

Yup, I downloaded the entire Firefox OS source code and built my own copy. I’ve done this with Android variants like CyanogenMod before. Although it isn’t for everyone, it’s usually a straightforward process. In this case, it wasn’t: it was impossible to install “straight” Firefox OS on my phone. It just kept rebooting over and over. The only thing that “cured” it was reinstalling ZTE’s 1.2 version full of test apps.

I bought the “wrong” version of the phone and I’m stuck with a bad firmware. Caveat emptor.

Would I recommend this phone to anyone, anywhere? Not unless you want to play with a Firefox OS device for cheap. Still, it’s been a few entertaining hours, and maybe someone will root it properly and open the bootloader up. Until then, I just hope ZTE throws us another update.

 

Filed Under: Tech, Writing

An invitation to become an editor, or perhaps a flower

March 3, 2014 by jorge

I can honestly say that I’ve never thought of myself as a flower.

Perhaps that’s why an invitation to join the editorial board of a journal in “the bouquet of STM JOURNALS” [sic] doesn’t really speak to me.

Or perhaps it’s because STM is on Beall’s list, and their invitation doesn’t quite ring true. It has the same hallmarks as the one in my previous post. To wit,

  1. Ego service: “we are happy to know about you, your rich experience and the areas of interests/ scientific work, that encompass to a big extent. We shall really feel fortunate by having you as an editor; your joining will certainly enhance our expert representation in the esteemed editorial board. “
  2. Please give us content: “We shall also be grateful to you, if you could contribute review article (s). We prefer, review articles commissioned by us i.e. to include the contributions arising from the editorial board/ publication management team- members of the Journal.”
  3. No mention whatsoever of the fact that these are Open Access journals. I’m sure it was just an innocent omission.
  4. A journal with a scope so broad as to be nonsensical (and with misspelled terms, to boot). See below.

Now THAT'S scope

 

Of course, these aren’t my areas of expertise. I’ve never published in any of these. That’s no problem, though: I’m invited to peruse the (no doubt) fragrant bouquet of journals and settle on one, buzzy bee that I am.

Wait, I thought I was a flower.

I’m confused.

 

Filed Under: Commentary, Research, Writing

Microsoft keeps trying to help me

February 27, 2014 by jorge

This is the state of my Lync problem as of today.

Yes, I changed the background. The rest looks the same. Bounce, bounce, login, crash.

I never stopped to think about this before, but if you’re Microsoft, people obviously blame you for everything wrong with their computer. Dodgy hardware crashes Windows? Microsoft sucks! A pirated copy of Windows ME got pwn3d by a script kiddie? Microsoft sucks! Office doesn’t have fancy typography? Microsoft sucks!

It’s no surprise, then, that Microsoft has somewhat insulated itself from its users. Microsoft online support doesn’t have any categories for “my program crashes”; all of the categories are of the sort “I can’t accomplish this task.” In other words, you can find things like “I can’t log in to the Exchange server”, or “I can’t place a video call”.

There’s no category for “I can’t even run the program,” because that’s not really supposed to happen. Big software companies devote tons of resources to ensure that their programs are written correctly and tested under most sane (and many, many insane) scenarios. The program crashing on launch is ludicrous, and frankly, almost unbelievable.

Except, obviously, that it is the exact problem I’m having.

When my previous post got retweeted by one of the good people at Ars Technica, it’s no surprise that two Microsoft employees contacted me directly and wanted to help. Let’s call them E and M. One of them went so far as to send me a voucher that would let me use Microsoft’s Pro technical support to get actual professional help with my issue.

So I called the number on the voucher. “Hi, this is Microsoft tech support! Oh? Voucher? I’ve never heard of such a thing. Anyway, I don’t support that product. Let me forward you…”

Long story short, after four calls and communicating with *at least* six different Microsoft Tech Support departments, the following truths were self-evident:

1. Lync for Mac is an obscure product.

2. Lync for Mac is, and is not, part of Office for Mac.

3. You cannot get support for Lync for Mac if your workplace licensed it for you. Someone else has to get the support for you.

4. Microsoft support people will tell you they are transferring you and then hang up on you. This happened at least three times.

5. No one at Microsoft knows about the free tech support vouchers.

This wasn’t just me – when M from Microsoft tried to call and pretended to be a customer, they hung up on him, too.

In the end, I wasted about three hours chasing down different support leads. No one was able to provide me with support, because I didn’t have my employer’s Partner ID, and I wasn’t using an Office 365 version of Lync. I actually have an Office 365 license, which I pay for out of my own pocket, for devices at home, but that’s another story. Reflecting back on it, I should’ve just lied.

In the end, an email to E, who contacted me thanks to @Lee_Ars’ original retweet, got traction. He sent the video linked above to someone internally, and that someone sent me a patch-in-testing that fixed the problem.

I can Lync again, for which I’m very grateful. We use it quite extensively with my team at work. It works great. We even bought one of Polycom’s 360 video conferencing systems for Lync (Microsoft used to sell them directly), and it’s fantastic. It can track a speaker all around the room and keep them in focus with good audio. Great stuff.

It took me three days to track down the problem, tweeting, writing a funny and reasonably articulate piece, being able to capture program logs, knowing some of the best tech writers on the planet, getting my frustration amplified, the personal dedication of the wonderful people on Microsoft’s Mac team, and some luck.

Sorry, Microsoft, but as good as some of your products are, that’s just horrible customer service. Your employees are fiercely proud of their company and willing to reach out to an end user to fix a problem. They deserve better from your support organization.

Filed Under: Commentary, First World Problems, Tech

Microsoft tries to @Helps me

February 21, 2014 by jorge

These last few days I’ve had a few unpleasant run-ins with support for different companies. These were not the result of mean people, but of uncaring and somewhat incompetent companies. Today I’ll be mocking the software titan, Microsoft.

I use Microsoft Lync a lot at work. It’s a good IM solution, and used to work fine. Then the latest patch came out (14.0.7). From then on, my Lync “experience” can be seen in the following YouTube video.

Bounce, bounce, login window, crash.

There are plenty of reports of this behavior on the Microsoft forums, but none of the diagnoses seemed applicable to me. So I did what the cool kids do, and contacted @MicrosoftHelps on Twitter. I sent them a link to that same video.

It didn’t start very well, because apparently they can’t look at YouTube. I guess they don’t want Microsoft Tech Support looking at cat videos when they should be confusing customers. Fine, I described the problem and posted a picture of the error.

Lync crash report

They sent me to a support article that seemed applicable. After all, it has a section that says ‘Lync for Mac crashes, and the user receives an “EXC_BAD_ACCESS”’ error. Which is indeed the kind of error I receive, and I am the user.

The article’s suggested troubleshooting? Collect log files, which means basically “gather evidence”. Not troubleshooting, and not a solution. Ok, I guess I can collect log files anyway. Never mind that you aren’t told what to DO with the collected log files, and as far as I know they don’t submit themselves anywhere.

How do you collect log files for Lync? Why, by checking “Turn on logging for troubleshooting” in the Lync preferences! The ones you can access after you open Lync. Which crashes immediately after opening. Slight problem.

Let’s look around Microsoft’s extensive online knowledge base for clues.

Office for Mac? What Office for Mac?

Office support. Let’s narrow it down… waitamoment. I probably was thinking of that other Office, the one made by the other Microsoft corporation. Because this one’s products don’t include Office 2011, or anything else for the Mac.

Even though they don’t seem to make this product, the @MicrosoftHelps folks are always willing to @Helps. They sent me to an entry in a support forum that suggested removing Lync completely and reinstalling it. I had mentioned that I had tried this before, but this didn’t really register in their support script. Whatever. This had happened before, so it could very well happen again.

Removing Lync completely is not for the faint of heart; it includes Terminal commands like

rm ~/Library/Preferences/ByHost/MicrosoftLyncRegistrationDB.648884C7-A874-5125-9557-0AE3BAAE9BF8.plist

and more, which delete files in folders hidden from mortal eyes because Apple would rather you leave them alone, thankyouverymuch.

Not a problem here. I cut my teeth on Slackware Linux when you had to download it to floppies, and 32 bits were new and shiny and the Most Significant 16 bits were unexplored, and your 386 could or could not have a 387. When seeing a mouse cursor on the screen was not guaranteed AT ALL, much less if you used Slackware Linux, because X Windows (which WAS NOT WINDOWS) required a whole ‘nother box of floppies and gobs of hard drive space.

I recently connected a mouse wirelessly to my Android phone, via Bluetooth, and saw a mouse cursor on its 1920×1080 pixel screen. A mouse cursor which I could move around AND CLICK ON THINGS. ON MY PHONE. We yearned for 640×480 pixels back in those days, let me tell you.

Anyway, I am a Highly Skilled Terminal Artisan, so I rm-ved files and sudo-ed, and Trashed Applications and wiped all memory of Lync from my mighty Mac Pro and then installed it again. And lo and behold, it continued to crash, just like the last time I followed Microsoft’s instructions.

So back to twittering, like the Twit I am. This time the nice and patient folks at @MicrosoftHelps pointed me to a tech support site where “Tech Agents” that could @Helps lurk. I went to chat with a Tech Agent.

After listening (readening?) to me describe my problem patiently, the Tech Agent explained (in what inside my head sounds like an apologetic tone) that they don’t “do” Mac. He couldn’t @Helps me after all.

Mac? Like the Big Mac at the place with the clown?

Microsoft Pro support can be reached through a website that suggests that it’ll cost $390 per incident. No, thanks. I don’t want their @Helps that bad. @MicrosoftHelps took the knowledge that their tech agents didn’t “do” Mac products in stride, though, and pointed me to the Office support site to look around for more information.

The same Office support site that doesn’t list Office for Mac among its products.

I guess Lync will continue to crash for a while.

 

—-

For the initiated: I don’t know what Lync’s doing, but it’s making one of Apple’s system libraries crash. While trying to send a message. Which is something that happens thousands of times a second in any modern operating system.

This is likely to be a Very Bad Thing, and a result of (at a guess) trying to use unallocated memory or some other atrocity.

Partial trace follows.

Thread 0 crashed:

#  1  0x95d4ba87 in _objc_msgSend + 0x00000017 (libobjc.A.dylib + 0x00005a87)
#  2  0x96409a63 in __CFAutoreleasePoolPop + 0x00000033 (CoreFoundation + 0x00032a63)
#  3  0x9414c43d in -[NSAutoreleasePool drain] + 0x0000007A (Foundation + 0x0005f43d)
#  4  0x94153d92 in __NSAppleEventManagerGenericHandler + 0x000000D1 (Foundation + 0x00066d92)
#  5  0x963ada35 in aeDispatchAppleEvent(AEDesc const*, AEDesc*, unsigned long, unsigned char*) + 0x0000014B (AE + 0x00032a35)
#  6  0x96382fbe in dispatchEventAndSendReply(AEDesc const*, AEDesc*) + 0x0000002C (AE + 0x00007fbe)

Filed Under: Commentary, First World Problems, Tech

You, too, can be a published scientist!

February 12, 2014 by jorge

Straight from my Gmail spam folder, the other side of the coin: solicitations for publications.

Phishing Phor Papers

Now that’s an enthusiastic Greetings!!! usually reserved for people promising me miracle drugs. Nigerian barristers tend to be more somber, which is reasonable given that they are usually informing me of the passing of previously unheard of (but very dear, and very illegitimately rich) relatives.

I sure am glad I am one of the “selective scientists” they have chosen! I don’t think they’ll consider this post “excellent work,” but whaddareyagonnado.

Jeffrey Beall maintains a list of what he kindly calls “Potential, possible, or probable predatory scholarly open-access publishers” at Beall’s list. If you are considering submitting work to a journal or publisher you haven’t heard of before, you should use his list as a starting point and do some research on your potential publisher. Unsurprisingly, today’s correspondents are on his list.

 

Filed Under: Commentary, Research

You, too, can be an editor!

January 14, 2014 by jorge

Part of being a scientist is curating the work of other scientists. This is called peer review. Peer review is critical to the well-being of science, because it helps ensure that the scientific record is important, correct, and has passed some level of validation before being put in front of other people.

Peer review is unpaid, tedious work. Most universities know that their faculty will spend some time performing peer review. Official institutional CVs and promotion paperwork therefore have space for review activities, so that you can show you’re contributing to the larger scientific community. It also makes you feel like a good citizen, which it should, since no one will ever thank you for it.

A step above this (and a major time sink) is the -still unpaid- position of editor of a scientific journal. A good editor knows his or her field well, has plenty of experience in the peer review trenches, and performs invaluable service to the scientific community.

Being the editor of a scientific journal, or part of the editorial board, brings some academic bragging rights. It is also somewhat expected (implicitly, of course; no one will ever say this) that your friends and colleagues will want to publish their stuff in your journal. Whether it’s because of the expectation of better service, faster reviews, helping a buddy out, or perhaps a less careful look at the flaws of the science in a paper, is not normally explained.

This post was prompted by an email I get every few weeks. The company, journal, and sender are always different, but the content isn’t.

Editorial Spam

Where, oh where should I start?

Let’s begin at the bottom. You’d think that being invited to be the editor of a journal, a big honor, merits a personalized email drafted after (one imagines) much careful deliberation. I’d wager that being invited to edit Science, or Nature, or (a bit further down the ladder) our Journal of the American Medical Informatics Association gets you exactly that. No one has invited me to edit any of those, which is probably for the best. I’d be a terrible editor.

Being invited to edit this journal-that-shall-not-be-named comes via spam, with a telltale line at the end giving away that this particular “eminent personality” (me) was picked from a database, and will be invited to edit again by Mail Merge in the future.

Let’s move on to the actual meat of the problem, though. I’ve never published a paper in this field. Ever. My “immense contribution” to the field is exactly ZERO.

What is this, then? It is, like most spam, about money. This is an Open Access journal (fact carefully omitted from the body of the email), which means that scientists will pay publication fees if their papers pass peer review and get published. I strongly suspect that ANY paper sent to this journal will get published, based on their selectivity in choosing editors.

It is, then, an offer of a line for my CV (“Editor for the Journal of IMPRESSIVE_SOUNDING_NAME”) and ego service (“Immense contribution!” “Eminent personalities!”) in return for the possibility of guiding some colleague with a paper that can’t-quite-get-published to an unknown Open Access journal edited by his or her buddy. Any paper will do; note that they strive to serve “biological, medical, and engineering fields”. I suspect that a paper on weather patterns on Mars is a-ok too.

I want to be absolutely clear about this: OPEN ACCESS IS A PHENOMENALLY GOOD IDEA. These clowns are polluting, distorting, and corrupting Open Access in the same way that “Barrister BENJAMIN KOFFI, A Legal Representative to late Mr.B A Herskovic” devalues email for everyone else.

Not that it matters, though. As far as I can tell from its website, this journal has never published an article. Nothing in the “current issue”. Nothing in “past issues”. “Articles in press”? Empty. It’s not included in the Journal Citation Reports or Scopus. Not that it should be, given that it hasn’t published squat, but I checked for completeness’ sake.

How many editors does this journal that, so far, publishes nothing have on its editorial board? According to its website, twenty-one. It sure seems like a lot of people to edit zero articles, but it’s the only way to get people to consider the journal and get those sweet, juicy Open Access fees flowing.

Ick.

 

Filed Under: Commentary, Research, Writing

False-Positive Psychology

January 10, 2014 by jorge

Fantastic post over at Slate Star Codex on how people use statistics to cheat at science. I already gave my take on the pressures scientists face, and the culture that leads to them.

I wish I had written that post, or its linked articles. Go read.

Filed Under: Commentary, Research

Complexity in MEDLINE – part 1

January 3, 2014 by jorge

This is my first not-a-paper publication. I wrote it in a much more conversational style, which I greatly prefer.

After Randy Schekman’s statements on journals such as Nature, Science, etc., I started wondering if there was a way to quantify what kinds of articles these journals publish. After all, the general perception (notwithstanding the allegations of inflated “brand value”) is that these journals publish high-quality science. I personally believe this is true, i.e., Nature and company do publish high-quality peer-reviewed science.

The trap that a lot of people fall into is believing that something not published in high-impact-factor (HIF) journals is of lower quality. I don’t believe this, so I wanted to test a different idea. My working hypothesis: that HIFs publish science that is of broad interest, so as to have a broad-enough readership to sustain their business model which relies on being very highly-cited (i.e. popular).

To test whether HIFs are “broad” we need two things: to define and measure “broadness” and to define “HIF”. We’ll start with the latter.

High Impact Factors

Some journals are simply widely known and almost universally recognized as “highly cited”. In medicine, those are the New England Journal of Medicine. Lancet. The British Medical Journal. JAMA. Nature. Science. Cell.

Household brand names, really, if you are in academic medicine. Their 2012 Impact Factors as computed by the Journal Citation Reports hover between 30 and 50. The Impact Factor for year n, by the way, is the average number of citations that articles published in years n-1 and n-2 receive from articles published in year n. As an average, it’s subject to all kinds of problems. Suffice it to say that CA: A Cancer Journal for Clinicians had a 2012 Impact Factor of 153… three times the “measly” 51 of the venerable New England Journal of Medicine. Such is the power of averages; a few very highly cited articles in a journal that publishes few articles, and you end up with a small journal with a higher IF than NEJM.

Here is the histogram of citations received, since publication (ergo, a number higher than the one used to compute the IF) for articles published in 2011 in CA: etc.

Citations_received_CA_histogram

Fine, the first bar covers a pretty broad range that might justify the IF. Let’s try again, binning the data into 20 equally-spaced bins.

Cites_2011_CA_20_bins

Convinced yet? The use of impact factors is madness, yet for some reason people’s careers depend on it. One article, on Global Cancer Statistics (very handy reference material) got 2000+ citations in 2012. Of course that will be highly cited, and it’s great, useful work. But it drags all the other 0-citation articles (I’m not claiming those are bad, mind you!) up into the HIF stratosphere.
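To make the “power of averages” point concrete, here is a toy sketch. The citation counts are invented for illustration (they are not the CA data plotted above), and the function name is mine; the only claim is that one blockbuster article can pull the mean far away from what a typical article receives.

```python
from statistics import mean, median


def impact_factor(citations_per_article):
    """Average citations per article, the quantity an Impact Factor reports."""
    return mean(citations_per_article)


# Hypothetical small journal: 9 uncited articles, 10 articles with
# 2 citations each, and one blockbuster with 2000 citations.
small_journal = [0] * 9 + [2] * 10 + [2000]

if __name__ == "__main__":
    print("mean (IF-style):", impact_factor(small_journal))
    print("median:", median(small_journal))
```

With these made-up numbers the mean lands at 101 while the median article has 2 citations, which is the whole problem in two lines of output.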

Regardless, Impact Factors aren’t going anywhere any time soon. So let’s use a rule of thumb to define a HIF journal. Talk to any scientist at a large academic medical center – most will complain that the “best,” flagship journal in their field (however they choose to define it) has an IF of around 3. This is true for Biomedical Informatics, of course. On the other hand, commonly-known HIF journals have IFs of 30. I’ll use these two as a heuristic and declare that a Low Impact Factor (LIF) journal has an IF ≤ 4. A Medium Impact Factor (MIF) journal has an IF >4 and ≤ 10. And > 10 is a HIF journal.
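Written down as code, the rule of thumb is a three-way threshold. This is just a sketch of the cutoffs stated above; the function name is my own.

```python
def impact_tier(impact_factor):
    """Classify a journal by Impact Factor using the cutoffs in the text."""
    if impact_factor <= 4:
        return "LIF"   # Low Impact Factor: IF <= 4
    if impact_factor <= 10:
        return "MIF"   # Medium Impact Factor: 4 < IF <= 10
    return "HIF"       # High Impact Factor: IF > 10


if __name__ == "__main__":
    for value in (3, 4, 10, 51, 153):
        print(value, impact_tier(value))
```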

Broadness

My working hypothesis here is that HIFs cover topics that are of broader interest than LIFs. Under this hypothesis, LIFs are highly technical, and thus have smaller readerships who, in turn, have small readerships of their own, so they get fewer citations.

So how general is the topic of a scientific article? MEDLINE can help. Every article in MEDLINE is tagged with MeSH headings: concepts from the Medical Subject Headings vocabulary assigned by a professional, highly-trained medical librarian describing what topics are covered in an article. There are actually two kinds of MeSH headings for a MEDLINE article: Major Topics and non-Major Topics. Major Topics are what an article is about; non-Major topics are important things mentioned in an article.

 
        <MeshHeadingList>
            <MeshHeading>
                <DescriptorName MajorTopicYN="N">Algorithms</DescriptorName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="N">Information Storage and Retrieval</DescriptorName>
                <QualifierName MajorTopicYN="Y">statistics &amp; numerical data</QualifierName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="N">Internet</DescriptorName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="N">Medical Subject Headings</DescriptorName>
                <QualifierName MajorTopicYN="N">utilization</QualifierName>
            </MeshHeading>
            <MeshHeading>
                <DescriptorName MajorTopicYN="N">PubMed</DescriptorName>
                <QualifierName MajorTopicYN="Y">statistics &amp; numerical data</QualifierName>
            </MeshHeading>
        </MeshHeadingList>

 

This is an example from the XML output of the record for one of my own articles. The nice thing about MeSH is that it is a thesaurus (in the Information Retrieval sense) – things have parents. For example, the MeSH term for Medical Subject Headings themselves is classified as

Information Science [L01]
   Information Services [L01.453]
      Documentation [L01.453.245]
         Vocabulary, Controlled [L01.453.245.945]
            Subject Headings [L01.453.245.945.700]
               Medical Subject Headings [L01.453.245.945.700.500]

(you can look these up in the MeSH Browser). The relationships are IS_A, which means that Medical Subject Headings IS_A kind of Subject Headings, which IS_A kind of Vocabulary, Controlled, which IS_A kind of Documentation, which in turn IS_A kind of Information Service which itself IS_A Information Science. You may or may not agree with the exact arrangement in this taxonomy; for my purposes, I just want you to note that Medical Subject Headings is a very detailed concept: it’s five levels under Information Science in the MeSH tree.

You also need to know that the MeSH indexers are required to use the most detailed term they can. This means that the article didn’t just mention any Subject Headings: it mentioned something more specific. So an article like mine above was more specific than one that just dealt with Subject Headings, because it dealt with Medical Subject Headings.

You can probably see where this is going. We can use the depth of the terms assigned to an article as a measure of its generality or lack thereof – of its “broadness.”

MeSH terms can be in multiple trees at once. In order to simplify things a little, we’ll choose the deepest term position possible, effectively assuming that all concepts are at their most specific. In other words, when an article is tagged with the MeSH concept “eye” we’ll assume Body Regions->Head->Face->Eye for a depth of 3 (counting from 0) instead of Sense Organs->Eye for a depth of 1. This is arbitrary, but if we can still find a difference in depth despite assuming that everything is as specific as possible, it should be a robust result.
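The depth rule can be sketched in a few lines, assuming depth is read straight off a MeSH tree number by counting the dots. The tree number for Medical Subject Headings is the one shown above; the two tree numbers for Eye are illustrative values that you should check in the MeSH Browser, and the function names are mine.

```python
def tree_depth(tree_number):
    """Depth of one tree position, counting the top-level heading as 0."""
    return tree_number.count(".")


def deepest_depth(tree_numbers):
    """Depth of the deepest position among all of a term's tree numbers."""
    return max(tree_depth(number) for number in tree_numbers)


# Tree number taken from the MeSH listing earlier in this post.
MEDICAL_SUBJECT_HEADINGS = ["L01.453.245.945.700.500"]

# Assumed tree numbers for Eye (Body Regions->Head->Face->Eye and
# Sense Organs->Eye); verify against the MeSH Browser before relying on them.
EYE = ["A01.456.505.420", "A09.371"]

if __name__ == "__main__":
    print(deepest_depth(MEDICAL_SUBJECT_HEADINGS))
    print(deepest_depth(EYE))
```

Taking the maximum is what implements “assume everything is as specific as possible”; swapping in min would give the opposite, most-general assumption.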

For our purposes, we’ll focus on Descriptors, which are the actual unmodified topics, and we’ll only take Major topics – things articles are about. We’ll collect them for all articles in every journal.  
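Here is a rough sketch of pulling Major-topic Descriptors out of one record. It uses the standard library’s xml.etree.ElementTree instead of xmltodict to stay self-contained, so it is not necessarily how my scripts do it. The sample is a trimmed and modified version of the record above (I flagged one Descriptor as Major so there is something to find). Whether a heading counts as Major when only its Qualifier is flagged is a judgment call, so the sketch makes that a switch.

```python
import xml.etree.ElementTree as ET

# Trimmed, modified sample: the last Descriptor is flagged "Y" for illustration.
SAMPLE = """
<MeshHeadingList>
    <MeshHeading>
        <DescriptorName MajorTopicYN="N">Algorithms</DescriptorName>
    </MeshHeading>
    <MeshHeading>
        <DescriptorName MajorTopicYN="N">PubMed</DescriptorName>
        <QualifierName MajorTopicYN="Y">statistics &amp; numerical data</QualifierName>
    </MeshHeading>
    <MeshHeading>
        <DescriptorName MajorTopicYN="Y">Medical Subject Headings</DescriptorName>
    </MeshHeading>
</MeshHeadingList>
"""


def major_descriptors(xml_text, count_qualifier_flags=True):
    """Return Descriptor names for headings marked as Major Topics.

    With count_qualifier_flags=True, a heading also counts as Major when
    any of its Qualifiers carries MajorTopicYN="Y" (an assumption).
    """
    root = ET.fromstring(xml_text.strip())
    majors = []
    for heading in root.iter("MeshHeading"):
        descriptor = heading.find("DescriptorName")
        if descriptor is None:
            continue
        flagged = descriptor.get("MajorTopicYN") == "Y"
        if count_qualifier_flags and not flagged:
            flagged = any(
                qualifier.get("MajorTopicYN") == "Y"
                for qualifier in heading.findall("QualifierName")
            )
        if flagged:
            majors.append(descriptor.text)
    return majors


if __name__ == "__main__":
    print(major_descriptors(SAMPLE))
    print(major_descriptors(SAMPLE, count_qualifier_flags=False))
```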

Methods

You’ll need a copy of PubMed/MEDLINE in XML format for this one. Don’t worry, it’s “just” 13 GB. Hard drives are cheap. You’ll also need a computer with 8 GB of RAM at a minimum, unless you want to edit my code and make it more efficient. It’s probably easier to just buy more RAM. You know you want to, anyway.

You’ll then need to install xmltodict. How to install Python packages is beyond the scope of this article, but if

pip install xmltodict

or

easy_install xmltodict

don’t help you, you’ll have to do some Googling. Then you’ll want to take MEDLINE and turn it into a Journal name->MeSH Term->count dictionary. Who wouldn’t? Grab a copy of read_medline.py and build_journal_term_database.py. Put them on the aforementioned computer with lots of RAM (and many cores help too, by the way), and run

python build_journal_term_database.py <PATH_TO_MEDLINE_BASELINE_HERE>

You can use pypy instead of python for a speed boost, at the cost of Even More Memory. When the process ends, there’ll be a file called journal_major_mesh_terms.pickle in the same directory. It’s only 131 MB on my machine.
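If you’d like a feel for the shape of that dictionary before committing 13 GB of disk to it, here is a toy sketch. The records are made up, the function name is mine, and the real scripts may organize things differently; the point is only the Journal name->MeSH Term->count nesting and the pickle round trip.

```python
import pickle
from collections import Counter, defaultdict


def count_terms(records):
    """Build {journal: {term: count}} from (journal, [terms]) pairs."""
    counts = defaultdict(Counter)
    for journal, terms in records:
        counts[journal].update(terms)
    # Convert to plain dicts so the result pickles and compares simply.
    return {journal: dict(counter) for journal, counter in counts.items()}


# Hypothetical records: one (journal, major MeSH terms) pair per article.
RECORDS = [
    ("Journal A", ["Algorithms", "PubMed"]),
    ("Journal A", ["PubMed"]),
    ("Journal B", ["Eye"]),
]

if __name__ == "__main__":
    table = count_terms(RECORDS)
    # Round-trip through pickle in memory, as the real file would on disk.
    restored = pickle.loads(pickle.dumps(table))
    print(restored)
```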

In part 2, we’ll compute some statistics on this dictionary.

Filed Under: Research
