Le Grunewald, Computer Science, University of Oklahoma
Should the South Big Data Hub be a data commons?
Are Hub members comfortable building upon existing NSF-sponsored cyberinfrastructure and other efforts that have had this level of investment?
What should the common cyberinfrastructure capabilities be?
Should a common infrastructure be made available to all Hub spoke proposals? To Hub membership writ large?
Are there other similar environments people are aware of?
Identify the software stack for which demos are needed as we think about the vision for the Hub and the components it requires.
What are the core software components?
When can we schedule demos for Hub members?
Notes (ALL- Please add your thoughts here):
Mike Conway - interested in web standards, the authentication model, moving data, and providing different ways of computation; how can we develop an architecture that allows different tools and services to plug in?
Reagan -- What are people's requirements? Are extensions needed?
Christine -- NDS, UT Austin/TACC, UC San Diego/SDSC, UI Urbana-Champaign/NCSA, etc. Carefully and deliberately working towards a truly national federation:
Kenton -- Built an app-store-like resource for data management tools/services, provisioned with resources and development tools, to foster the NDS goal of having these things interoperate with each other. Provide an infrastructure for scientists:
Victor Hazlewood discussed at a high level the information services available from XSEDE.
XSEDE maintains an extensive information services capability. It is built on GLUE2, and the information service repositories have REST APIs to all the information. It is available at https://info.xsede.org/, and the "Primary APIs and browsing interfaces" section is a good place to start looking at what is available. The current participants are the 10 "allocated" XSEDE Service Provider resources at 5 centers ("allocated" means NSF-funded resources).
The XSEDE Federation (https://www.xsede.org/web/sp-forum/spf-members) also has 26 "unallocated" Service Providers (SPs). These SPs are being asked to enter any resource they want to publish in the XSEDE Resource Description Repository and to list service/software information in the XSEDE information services.
The information service feeds the XSEDE portal (https://portal.xsede.org/), which displays resource lists (https://www.xsede.org/resources/overview), software (https://portal.xsede.org/software#/), job/load information, and other information for the benefit of XSEDE users and stakeholders.
XSEDE also has a requirements collection capability: national CI stakeholders can send requirements to XSEDE, which then investigates whether to initiate an XSEDE engineering activity to add a capability, service, or software. This could include augmenting the XSEDE information services, perhaps for the benefit of the Hubs.
Victor will put together a future Hub presentation describing the XSEDE information service in more detail (and will give this presentation to all the Hubs where possible).
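As a rough illustration of how a Hub tool might consume such a REST information service, here is a minimal Python sketch. Only the base URL comes from the notes above. The endpoint path, the JSON field names (resource_id, provider_level, software), and the sample records are all hypothetical placeholders, since the notes do not describe the actual API layout. The sketch works on an in-memory sample payload instead of making a network call.

```python
import json
from urllib.parse import urljoin

# Base URL as given in the notes.
BASE_URL = "https://info.xsede.org/"

# HYPOTHETICAL path: the notes do not name any endpoint. Look up the real
# one under "Primary APIs and browsing interfaces" before using this.
RESOURCES_PATH = "example/resources/"

# HYPOTHETICAL payload: field names and records are invented for
# illustration and do not reflect the real response format.
SAMPLE_RESPONSE = """
[
  {"resource_id": "alpha.example.org", "provider_level": "allocated",
   "software": ["gcc", "python"]},
  {"resource_id": "beta.example.org", "provider_level": "unallocated",
   "software": ["r"]},
  {"resource_id": "gamma.example.org", "provider_level": "allocated",
   "software": []}
]
"""


def resources_url(base=BASE_URL, path=RESOURCES_PATH):
    """Build the (hypothetical) URL a client would request."""
    return urljoin(base, path)


def allocated_resources(payload):
    """Return the sorted IDs of records marked as allocated."""
    records = json.loads(payload)
    return sorted(
        record["resource_id"]
        for record in records
        if record.get("provider_level") == "allocated"
    )


if __name__ == "__main__":
    print(resources_url())
    print(allocated_resources(SAMPLE_RESPONSE))
```

A real client would fetch the payload over HTTP and adapt the field names to whatever the service actually returns. The filtering step is the part a Hub could reuse to separate "allocated" from "unallocated" Service Provider resources.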
Three sets of NSF-sponsored infrastructure discussed:
NDS infrastructure - workflows
XSEDE infrastructure - computer resources
DataNet Federation infrastructure - research collaborations
Plus private sector infrastructure (Data.World, Microsoft, etc)
(MC) We are also investigating the Agave system at TACC (http://agaveapi.co/), which is more oriented to HPC, and don't know yet how it fits with the rest of the components.
The Discovery Environment is interacting with the XSEDE environment and, through the Odum Institute, has interacted with NDS. So it is possible to interoperate across these three environments, and we would like to pursue that.
One way to go forward is to develop a distributed test bed. Demonstrate the interoperability, distributed use of resources, or distributed use of
Victor can describe, in a presentation, which information services are available.
Reagan - Identify explicit services beyond compute and storage, like geospatial services.
Demos to Schedule for the Hubs:
CyVerse (Nirav, Mike Conway)
NDS Labs (Christine)
Meetings: preference is once a month for larger group discussion, 2x or more for core working team
3 PM EST may work on Fridays for monthly meeting.
Next week: discuss the draft list of core Hub components and evaluate opportunities for federation and interoperability. We would like to get comments back.
Victor Hazlewood <email@example.com> COO Joint Institute for Computational Sciences (Joint between UTK and ORNL) & XSEDE Deputy Director of Operations & Service Provider Coordinator
The South Big Data Hub Community Engagement, Diversity and Partnerships Working Group has initially identified use cases as an essential way to educate and help new entrants to data science or HPC navigate the decisions surrounding which cyber-infrastructure tools are appropriate for a given project. The working group seeks to initially collect tangible use cases from different domains, forms of cyber-infrastructure, and software tools.
In addition, the South Hub is facilitating community discussions around several verticals, including data science in: Health, Energy, Materials and Manufacturing, Habitat Planning, Smart and Connected Communities, Coastal Hazards, and Privacy/Security, Policy and Economic Modeling. The Hub is seeding these discussions through initial open conference calls as a means of networking and information dissemination. If you are a member of the South Hub community and wish to help organize or lead these calls, or have an interest in establishing a regular working group as part of the South Hub, please contact the Hub team at firstname.lastname@example.org.