Sept 30 Demo: Integrated Rule-Oriented Data System (iRODS)
Summary: The Integrated Rule-Oriented Data System (iRODS) is open source data management software used by research organizations and government agencies worldwide. This overview and demonstration will cover the "Four Pillars" of iRODS, some use cases seen in the wild, and various open source clients from the iRODS Consortium, RENCI, and others in the community.
Presenter: Terrell Russell, Senior Data Scientist, RENCI
Sept 30 Demo: A Brief Introduction to Jupyter Notebook
Summary: This talk will introduce Project Jupyter with a focus on Jupyter Notebook, a web-based environment for interactive computing. The talk will include a live demo of the basic features of the notebook tool, examples of its use in academia and industry, and a brief look at what's up-and-coming in the project.
Presenter: Peter Parente, Computational Engineer, MaxPoint Interactive, Jupyter Steering Council
Over the past 10 years, the most common topics in the scholarly big data literature were data mining (9%) and data analytics (7%), while privacy and security accounted for only 2%. See Strang & Sun (2017), Scholarly big data body of knowledge: What is the status of privacy and security? Annals of Data Science, in press.
In 2016, data mining as a publication topic dropped to 1%, while social media and cloud computing increased their share of the total.
In 2016, there also was a proliferation of specific topics in big data. There was a high percentage of literature reviews and conceptual frameworks, and limited focus on applications of big data.
In 2016, in the first 6 months of production, journal research (excluding websites, trade magazines, and magazines) was 26% literature reviews and 15% taxonomies/frameworks/research design, 41% in total.
Then there was a focus on statistics and machine learning (1%).
Privacy and security studies, however, fell as a share of publications in 2016 to below 1%, about half of what it had been.
Lea: Is this due to limited NSF funding for privacy & security big data research?
Ken: Ethics of big data is another topic that needs attention.
Ken: Difficult to do big data research in terrorism. Hard to assess the accuracy of the information, and the data are sensitive and sit behind layers of security. Need behavioral and predictive research.
Ken: What would be ways we could apply methods to come up with predictive models using big data? Need structure and standards in place for organizations that do have access to the data, like DHS, so the work can be replicated as a model.
Several forums have identified "Data Science" as an essential element in educating students for the 21st century workforce. Because data literacy at multiple levels is now needed to navigate the deluge of data in almost every scientific discipline and business sector, the United States is suffering from a lack of a trained workforce to meet current and emerging demand.