CSE 8803 / ME 8883 - Materials Informatics Course - Fall 2016

20 Aug 2016

High-dimensional PES metamodel via DFT - Blog 1: Data acquisition (Group:WangTran) - Report 13Sep2016

Fall 2016 Georgia Tech

Anh Tran and Zhiyu Wang

It has been a few weeks since we started this project. We have reached a few milestones and would like to share our progress so far on our blog.

First, we have set up a large batch of simulations on the PACE cluster (a high-performance computing cluster at Georgia Tech). By randomly sampling the input parameters and retaining each set of inputs together with its corresponding outputs, we have now collected almost 15,000 data points. We have a data-mining shell script (*.sh) that works reasonably well to extract and concatenate the numeric inputs and outputs.
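
As a rough illustration of the sample-and-record workflow described above, here is a minimal Python sketch. It is not the actual PACE job setup or the *.sh data miner: the parameter bounds, the `run_simulation` placeholder (a toy function standing in for the real simulation), and the whitespace-separated row layout are all assumptions made for the example.

```python
import random

# Hypothetical parameter bounds; the real input ranges are not given here.
BOUNDS = [(-1.0, 1.0), (0.0, 2.0), (0.5, 1.5)]


def sample_inputs(rng, bounds):
    """Draw one input vector uniformly at random within the given bounds."""
    return [rng.uniform(lo, hi) for lo, hi in bounds]


def run_simulation(x):
    """Placeholder for the expensive simulation.

    Returns a toy scalar so that the sketch can run anywhere.
    """
    return sum(v * v for v in x)


def collect(n_samples, seed=0):
    """Sample inputs, evaluate them, and keep inputs and outputs paired."""
    rng = random.Random(seed)
    inputs, outputs = [], []
    for _ in range(n_samples):
        x = sample_inputs(rng, BOUNDS)
        inputs.append(x)
        outputs.append(run_simulation(x))
    return inputs, outputs


def format_rows(inputs, outputs):
    """Concatenate each input vector with its output into one text row."""
    return [" ".join(f"{v:.6e}" for v in x + [y]) for x, y in zip(inputs, outputs)]


inputs, outputs = collect(5)
rows = format_rows(inputs, outputs)
```

Fixing the seed keeps the sampled inputs reproducible, which helps when a failed job has to be rerun and matched back to its recorded inputs.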

In terms of data structure, we are attempting to build a database with HDF5 for parallel I/O. The HDF5 package has been downloaded and tested with simple C++ programs. Since the inputs and outputs are already collected in log.inputs.txt and log.output.txt, importing them into HDF5 should not be too hard.
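
To give an idea of what such a store could look like, here is a minimal sketch in Python with h5py and NumPy. It is written in serial, not with the C++ API or the parallel I/O mentioned above. The file name, dataset names, and array shapes are assumptions, and random arrays stand in for the contents of log.inputs.txt and log.output.txt, whose exact layout is not described here.

```python
import os
import tempfile

import h5py
import numpy as np

# Stand-in data: in practice these arrays would be parsed from
# log.inputs.txt and log.output.txt (for example with np.loadtxt,
# assuming whitespace-separated numeric columns, one sample per row).
rng = np.random.default_rng(0)
inputs = rng.uniform(-1.0, 1.0, size=(100, 3))  # 100 samples, 3 parameters (assumed)
outputs = np.sum(inputs**2, axis=1)             # toy scalar output per sample

path = os.path.join(tempfile.mkdtemp(), "pes_samples.h5")  # hypothetical file name

# Write both arrays into one HDF5 file; row i of each dataset is sample i.
with h5py.File(path, "w") as f:
    f.create_dataset("inputs", data=inputs)
    f.create_dataset("outputs", data=outputs)
    f.attrs["n_samples"] = inputs.shape[0]

# Read back a slice without loading the whole dataset into memory.
with h5py.File(path, "r") as f:
    first_ten = f["inputs"][:10]
    n_stored = int(f.attrs["n_samples"])
    stored_outputs = f["outputs"][...]
```

Keeping inputs and outputs as two datasets with matching row order means a sample can be looked up by a single index in both.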

In terms of research, many helpful ideas have come up, particularly from Andrew Medford’s suggestions. We are looking for a way to combine a very large data set with neural network learning. The idea is very promising, but there are still challenges ahead.

