
Where are the Datacentric Jobs?

The inspiration for this visualization was the report titled "The Importance of Data Occupations in the U.S. Economy", released by the Economics and Statistics Administration (U.S. Department of Commerce) in early 2015. A PDF of that report can be downloaded from here. As per the Executive Summary:

In this report we identify occupations where data analysis and processing are central to the work performed and measure the size of employment and earnings in these occupations, as well as in the industries that have the highest concentration of these data occupations.

One of the Key Findings:

Data intensive industries are located in many states, but the highest concentrations are in Washington, D.C.; Virginia; Massachusetts; Maryland; and Connecticut.

But the geographical aspects of that analysis struck me as too broad: one might wonder whether San Francisco (and/or the San Jose area, and/or XYZ) is lost in the mass of its enclosing state. Also, any analysis that treats Washington, D.C. as equivalent to the entire state of New York, rather than to New York City, will tend to be a bit misleading. This visualization therefore ranks metropolitan areas instead of states. Note that a metropolitan area is only considered a candidate for ranking if at least 5 data occupations are present there once the Data Importance filter has been applied.

Another factor I wanted to lend more weight to was the datacentricity of various occupations. The ESA report defines “data occupations” by:

focusing on jobs where the use of data is “very important” as identified by O*Net, a comprehensive system of job descriptions developed with the support of the U.S. Department of Labor.

which presumably winds up giving equal weight to both 'data important' occupations, such as Chief Executives, and those that are 'all about the data', e.g. Astronomers. The slide bar below, underneath 'DEGREE OF DATACENTRICNESS + LOG(DATACENTRICTUDE)', provides a way to give greater (or lesser) emphasis to the data importance value (Avg Data Importance in the table below) that the O*Net database assigns to each occupation. Note that the numbers behind this method are arbitrary, to say the least. Roughly, each metropolitan statistical area (MSA) gets a score from a calculation of the form SUM(JOBS_1000 * AvgDataImportance), where JOBS_1000 is a number supplied by the U.S. Bureau of Labor Statistics. The original ESA report used BLS data from May 2013, but this viz uses more recent numbers from May 2014.
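As a rough illustration of that per-MSA calculation, here is a minimal Python sketch. It is not the viz's actual implementation: the function name `rank_msas`, the row layout, and the `emphasis` exponent (one guess at how a slider might strengthen or weaken the importance value) are all assumptions made for this example. Only the SUM(JOBS_1000 * AvgDataImportance) form and the 5-occupation minimum come from the description above.

```python
from collections import defaultdict


def rank_msas(rows, min_importance=0.0, emphasis=1.0, min_occupations=5):
    """Rank metropolitan areas by a datacentric-jobs score.

    rows: iterable of (msa, occupation, jobs_1000, avg_data_importance).
    min_importance: stand-in for the Data Importance filter (assumed).
    emphasis: hypothetical exponent on the importance value; 1.0 gives
        the plain SUM(JOBS_1000 * AvgDataImportance) form.
    min_occupations: an MSA needs at least this many occupations left
        after filtering to be ranked.
    """
    scores = defaultdict(float)
    counts = defaultdict(int)
    for msa, _occupation, jobs_1000, importance in rows:
        if importance < min_importance:
            continue  # occupation dropped by the importance filter
        scores[msa] += jobs_1000 * importance ** emphasis
        counts[msa] += 1
    # Keep only MSAs with enough surviving data occupations.
    eligible = {m: s for m, s in scores.items() if counts[m] >= min_occupations}
    # Highest score first.
    return sorted(eligible.items(), key=lambda kv: kv[1], reverse=True)
```

With `emphasis` above 1.0, occupations that are 'all about the data' pull further ahead of merely 'data important' ones; below 1.0 the gap narrows. Raising `min_importance` can drop an MSA from the ranking entirely if fewer than 5 of its occupations survive the filter.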