Authors: Kalinichenko L., Stupnikov S., Fazliev A., Gordov E., Okladnikov I., Kiselyova N., Kovaleva D., Malkov O., Podkolodny N., Ponomareva N., Pozanenko A., Volnova A.
Title: New data access challenges for data intensive research in Russia
Bibliographic reference: Kalinichenko L., Stupnikov S., Fazliev A., Gordov E., Okladnikov I., Kiselyova N., Kovaleva D., Malkov O., Podkolodny N., Ponomareva N., Pozanenko A., Volnova A. New data access challenges for data intensive research in Russia // CEUR Workshop Proceedings. - 2015. - Vol. 1536. - P. 215-237. - ISSN 1613-0073.
External systems: RSCI (РИНЦ): 27146775;
Abstract: eng: This survey analyzes global trends in the development of massive data collections and related infrastructures in order to evaluate the opportunities for shared use of such collections in research, decision making and problem solving in various data intensive domains (DIDs) in Russia. The representative set of DIDs selected for the survey includes astronomy, genomics and proteomics, neuroscience (human brain investigation), materials science and Earth sciences. For each DID, the strategic initiatives (or large projects) in the USA and Europe aimed at creating big data collections and the respective infrastructures planned up to 2025 are briefly reviewed. The IT projects aimed at developing infrastructures that support access to and analysis of such data collections are also briefly reviewed. The paper concludes with a proposal to organize in Russia a targeted interdisciplinary program to develop a pilot project of a distributed infrastructure and platform for access to various kinds of data worldwide, for data storage, and for data analysis during research in various DIDs. As part of this infrastructure, the program should also include the development of a high performance interdisciplinary center supporting data intensive applications in various DIDs. The survey is also intended to serve as a basis for the panel discussion at the International Conference DAMDID/RCDL'2015.
Published: 2015
Physical description: pp. 215-237
Citations:
1. LIGO Scientific Collaboration, Virgo Collaboration: J. Aasi et al. Prospects for Localization of Gravitational Wave Transients by the Advanced LIGO and Advanced Virgo Observatories. LIGO-P1200087, VIR-0288A-12, 2013. - http://arxiv.org/abs/1304.0670.
2. C. P. Ahn, R. Alexandroff, C. A. Prieto et al. The Tenth Data Release of the Sloan Digital Sky Survey: First Spectroscopic Data from the SDSS-III Apache Point Observatory Galactic Evolution Experiment. The Astrophysical Journal Supplement, 211(2), 2014. DOI:10.1088/0067-0049/211/2/17.
3. ASM Alloy Phase Diagram Database. - http://www.asminternational.org/asmenterprise/apd/help/About.aspx.
4. Asia Materials Data Committee. - http://amdc.org/index.html.
5. S. Barthelmy. GCN and VOEvent: A status report. Astronomische Nachrichten, 329(3), 2008, p. 340-342.
6. A. N. Belikov, F. Dijkstra, J. A. Gankema et al. Information systems playground - The target infrastructure. Scaling Astro-WISE into the petabyte range. Experimental Astronomy, 35(1-2), 2011, p. 367-389.
7. BD2K centers. - https://datascience.nih.gov/bd2k/fundedprograms/centers.
8. G. V. Belov, V. S. Iorish, V. S. Yungman. IVTANTHERMO for Windows - database on thermodynamic properties and related software. CALPHAD, 23(2), 1999, p. 173-180.
9. V. Bennett, P. Kershaw, M. Pritchard et al. EO science from big geo data on the JASMIN-CEMS infrastructure. Proc. of the Conference on Big Data from Space BiDS'14. European Space Agency-ESRIN, 2014.
10. B. Frezouls, P.-M. Brunet. Big data technology in the service of the Gaia data processing. Proc. of the Conference on Big Data from Space BiDS'14. European Space Agency-ESRIN, 2014.
11. H. Binder, L. Hopp, K. Lembcke, H. Wirth. Personalized disease phenotypes from massive OMICs data. Big Data Analytics in Bioinformatics and Healthcare. IGI Global, 2015.
12. BRAIN 2025: A Scientific Vision. - http://braininitiative.nih.gov/2025/BRAIN2025.pdf.
13. Copernicus. Observing the Earth. - http://www.esa.int/Our-Activities/Observing-the-Earth/Copernicus/Overview3.
14. Copernicus Space Component Data Access Portfolio: Data Warehouse 2014 - 2020. Prepared by B. Hoersch, V. Amans. Reference COPEPMAN-EOPG-TN-15-0004. 2015.
15. Data Warehouse Requirements V2.0 - Copernicus Data Access Specifications of the space-based Earth Observation needs for the period 2014-2020.
16. B. D. Dusenbery, Z. Onder, D. Locke, K. Blairl, D. Kural. Petabyte-Scale Cancer Genomics in the Cloud. TCGA Symposium 2015 Poster Presentation.
17. DBs of NIST. - http://www.nist.gov/chemistryportal.cfm.
18. M. L. Dubernet, V. Boudon, J. L. Culhane et al. Virtual atomic and molecular data centre. Journal of Quantitative Spectroscopy and Radiative Transfer, 111(15), 2010, p. 2151-2159.
19. E. Gangler. Big data challenge posed by the Large Synoptic Survey Telescope. Proc. of the 2014 conference on Big Data from Space BiDS'14. European Space Agency-ESRIN, 2014.
20. EUDAT - European Data project. - http://www.eudat.eu/.
21. Fact Sheet. 2014. - https://www.whitehouse.gov/sites/default/files/microsites/ostp/brain-fact-sheet-9-30-2014-final.pdf.
22. S. W. Fleming, F. Abney, T. Donaldson et al. Beyond The Prime Directive: The MAST Discovery Portal and High Level Science Products. American Astronomical Society, AAS Meeting #225, #336.59, 2015.
23. N. Fourniol, J. Lockhart, D. Suchar et al. News from ESO Archive Services: Next Generation Request Handler and Data Access Delegation. Proceedings of Astronomical Data Analysis Software and Systems XXI Conference. ASP Conference Series, V. 461. Edited by P. Ballester, D. Egret, and N.P.F. Lorente. San Francisco: Astronomical Society of the Pacific, 2012, p. 669.
24. Genome 10K community of scientists. Genome 10K: A Proposal to Obtain Whole-Genome Sequence for 10 000 Vertebrate Species. Journal of Heredity, 100(6), 2009, p. 659-67.
25. C. S. Greene, J. Tan, M. Ung, J. H. Moore, C. Cheng. Big Data Bioinformatics. Journal of Cellular Physiology, 229(12), 2014, p. 1896-1900.
26. D. Gomez-Cabrero, I. Abugessaisa, D. Maier, A. Teschendorff, M. Merkenschlager, A. Gisel, E. Ballestar, E. Bongcam-Rudloff, A. Conesa, J. Tegnér. Data integration in the era of omics: current and future challenges. BMC Systems Biology, 8(2:I1), 2014.
27. N. S. Kardashev, V. V. Khartov, RadioAstron collaboration. RadioAstron - A Telescope with a Size of 300 000 km: Main Parameters and First Observational Results. Astronomy Reports, 57, 2013, p. 153-194.
28. M. J. Hawrylycz, E. S. Lein, A. L. Guillozet-Bongaarts et al. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature, 489, 2012, p. 391-399.
29. M. Herland, T. M. Khoshgoftaar, R. Wald. A review of data mining using big data in health informatics. Journal of Big Data, 1(2), 2014.
30. T. Hey, S. Tansley, K. Tolle. The Fourth Paradigm - Data Intensive Scientific Discovery. 2009. - http://goo.gl/edvr6W.
31. Human Brain Project. - https://www.humanbrainproject.eu.
32. Human Connectome Project. WU-Minn HCP 500 Subjects Data Release: Reference Manual. 2014.
33. International Centre for Diffraction Data. - http://www.icdd.com/.
34. N. T. Issa, S. W. Byers, S. Dakshanamurthy. Big data: The next frontier for innovation in therapeutics and healthcare. Expert Rev. Clin. Pharmacol. 7(3), 2014, p. 293-298.
35. M. Juric, T. Tyson. LSST Data Management: Entering the Era of Petascale Optical Astronomy. Highlights of Astronomy, 16, 2015, p. 675.
36. D. B. K. Kamesh, V. Neelima, R. R. Priya. A Review of Data Mining using Bigdata in Health Informatics. International Journal of Scientific and Research Publications, 5(3), 2015.
37. I. Khabibullin, S. Sazonov, R. Sunyaev. SRG/eROSITA prospects for the detection of GRB afterglows. Monthly Notices of the Royal Astronomical Society, 426(3), 2013, p. 1819-1828.
38. N. N. Kiselyova, V. A. Dudarev, V. S. Zemskov. Computer information resources in inorganic chemistry and materials science. Russ. Chem. Rev. 79(2), 2010, p. 145-166.
39. D. Kumar, R. Kumar. Impact of Biological Big Data in Bioinformatics. International Journal of Computer Applications, 101(11), 2014.
40. J. W. Lichtman, H. Pfister, N. Shavit. The big data challenges of connectomics. Nature Neuroscience, 17, 2014, p. 1448-1454.
41. LSST and Technology Innovation. - http://www.lsst.org/lsst/about/technology.
42. The Materials Data Facility. - http://www.nationaldataservice.org/mdf/.
43. Materials Genome Initiative for Global Competitiveness. 2011. - http://www.whitehouse.gov/sites/default/files/microsites/ostp/materials-genome-initiative-final.pdf.
44. Materials Genome Initiative Strategic Plan. 2014. - http://www.whitehouse.gov/sites/default/files/microsites/ostp/NSTC/mgi-strategic-plan-dec-2014.pdf.
45. Materials Science International GmbH. - http://www.matport.com/phase-diagramcenter/buy-online/purchase/selectElements.
46. Materials Science Portal. - http://www.nist.gov/materials-science-portal.cfm.
47. C. A. Mattmann. Next generation cyberinfrastructure to support comparison of satellite observations with climate models. Proc. of the Conference on Big Data from Space BiDS'14. ESA - ESRIN, 2014.
48. J. M. Mazzarella, P. M. Ogle, D. Fadda et al. Explosive Growth and Advancement of the NASA/IPAC Extragalactic Database (NED). American Astronomical Society, AAS Meeting #223, #302.04. 2014.
49. National Data Service (NDS). - http://www.nationaldataservice.org/.
50. NEMO ontologies. - http://purl.bioontology.org/ontology/NEMO.
51. Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC). - http://www.nitrc.org/include/about-us.php.
52. NIMS Materials Database. - http://mits.nims.go.jp/db-top-eng.htm.
53. Number of entries using search query 'database'. neuinfo.org. NIF. Retrieved 25 Jan 2015. - https://neuinfo.org/mynif/search.php?q=database&first=true&t=indexable&nif=nlx-144509-1.
54. OASIS. - http://www.oasis-brains.org/.
55. E. Perret, T. Boch, F. Bonnarel et al. Working Together at CDS: The Symbiosis Between Astronomers, Documentalists, and IT Specialists. Proc. of the Open Science at the Frontiers of Librarianship Conference. ASP Conference Series, Vol. 492. San Francisco: Astronomical Society of the Pacific, 2015, p. 13.
56. Phases Database. - http://phases.imet-db.ru.
57. H. K. Ramapriyan, J. Behnke, E. Sofinowski, D. Lowe, M. A. Esfandiari. Evolution of the Earth Observing System (EOS) Data and Information System (EOSDIS). In Standard-Based Data and Information Systems for Earth Observation: Lecture Notes in Geoinformation and Cartography, Liping Di and H.K. Ramapriyan, Eds. Springer: Berlin-Heidelberg, 2010.
58. G. Rixon, M.-L. Dubernet, N. Piskunov et al. VAMDC - The Virtual Atomic and Molecular Data Centre - A New Way to Disseminate Atomic and Molecular Data - VAMDC Level 1 Release. Journal of Physics: Conference Series, 1344, 2011, p. 107-115.
59. Scientific Group Thermodata Europe. - http://www.met.kth.se/sgte/.
60. J. L. Schnase, D. Q. Duffy, M. A. McInerney et al. Climate Analytic as a Service. Proc. of the Conference on Big Data from Space BiDS'14. ESA - ESRIN, 2014.
61. B. M. Shustov, A. I. Gomez de Castro, M. Sachkov et al. WSO-UV progress and expectations. Astrophysics and Space Science, 354(1), 2014, p. 155-161.
62. Springer Materials. - http://materials.springer.com/.
63. STN. - http://www.stn-international.de.
64. A. R. Taylor. Data Intensive Radio Astronomy en route to the SKA: The Rise of Big Radio Data. Highlights of Astronomy, 16, 2015, p. 677.
65. P. de Teodoro, A. Hutton, B. Frezouls et al. Data Management at Gaia Data Processing Centers. Astrostatistics and Data Mining, Springer Series in Astrostatistics, V. 2. Springer Science+Business Media New York, 2012.
66. The Copernicus Space Component: Sentinels Data Products List. ESA, Copernicus Space Component Ground Segment team. Reference COPE-GSEGEOPG-PD-14-0017. 2014.
67. The LSST-French Connection: Signed and Tweeted! - http://www.lsst.org/News/enews/frenchconnection-201504.html.
68. Towards a 10-year vision for global research data infrastructures. GRDI2020 Final Roadmap Report. 2012, 108 p. - http://www.grdi2020.eu.
69. User Categories. - https://spacedata.copernicus.eu/web/cscda/copernicus-users/user-categories.
70. Versailles Project on Advanced Materials and Standards (VAMAS). - http://www.vamas.org/.
71. Why neuroinformatics? International Neuroinformatics Coordinating Facility. - http://www.incf.org/about/why-neuroinformatics.
72. O. Zhelenkova, V. Vitkovsky, T. Plyaskina. Electronic archive of observational data of astrophysical observatory. Digital Libraries, 13(4), 2010. - http://www.elbib.ru/index.phtml?page=elbib/rus/journal/2010/part4/ZVP.
73. N. S. Kardashev, I. D. Novikov, V. N. Lukash et al. Review of scientific topics for the Millimetron space observatory. Physics-Uspekhi, 57(12), 2014, p. 1199-1228.