WWWalker's Data Integration

See also Skills | Portals | Groupware | Finance Management | Hadoop | Hive | Dell Boomi | Map Mashups | Health

Data Mining

Big Data

In June 2021, I used the Amazon Web Services (AWS) command-line tools to migrate files between S3 buckets to set up backups for a client.

In June 2012 and January 2017, I used AWS to move a large email repository from one Linux server to another across the cloud.

On 31/8/2014 at Microsoft Brisbane, I used Queensland Open Data and Microsoft Azure HDInsight to write scripts for Hadoop and Machine Learning (not Hive yet) to process big data for business intelligence. These skills will be useful once large semantic Web and ecommerce projects start to materialise, and they suit the post-recession climate, in which local businesses need to claw back ground lost to overseas raiders and poor economic conditions.

Many traditional bricks-and-mortar retailers in Australia, such as Harvey Norman, are starting to put together online stores to compete with US and European competitors, who are much cheaper due to economies of scale.

On 12/12/13, I viewed a webinar on how data scientists at Google, LinkedIn and Foursquare use big data to create impressive results and insights for end-users.

Open Data Mashups

Since 2011, we have planned to work with Queensland Government open data initiatives in transport mashups.

In March 2013, Dwight Walker attended an open data seminar held by the Queensland Government, which plans to release all public domain data as static datasets for integration into 3rd party Web applications, with the aim of increasing economic development in Queensland.

There is a big opportunity to do 3rd party data mining on Queensland Health open data to improve systems using the cloud.

Hospital GPs record only 25% of patient data electronically, which makes this the biggest blockage to ehealth in Australia.

Data Cleaning

Importing Bulk Data into SQL

We have information management and software engineering qualifications, so we can apply these to perform data integration effectively and efficiently.

We have worked on large data cleaning projects:

We cut the time to import 1 million records from 58 hours to 2 hours by cleaning the data with awk and bash scripts before importing it into MySQL. We also ran batch imports using PHP.
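The clean-before-import idea can be illustrated with a minimal sketch. It is not the original awk and bash pipeline: it is written in Python, and it uses an in-memory SQLite database as a stand-in for MySQL so that it runs without a server. The table name `customers`, the three-column layout and the sample rows are all hypothetical, chosen only for illustration.

```python
import csv
import io
import sqlite3


def clean_rows(lines):
    """Yield cleaned (id, name, email) tuples, skipping malformed or duplicate rows."""
    seen_ids = set()
    for row in csv.reader(lines):
        # Drop rows that do not have exactly three fields.
        if len(row) != 3:
            continue
        record_id, name, email = (field.strip() for field in row)
        # Drop rows with a non-numeric or repeated id, or an empty name.
        if not record_id.isdecimal() or record_id in seen_ids or not name:
            continue
        seen_ids.add(record_id)
        yield int(record_id), name, email.lower()


def bulk_import(connection, rows):
    """Insert all rows in one transaction; return the number of rows inserted."""
    rows = list(rows)
    with connection:
        connection.executemany(
            "INSERT INTO customers (id, name, email) VALUES (?, ?, ?)", rows
        )
    return len(rows)


# Hypothetical sample input: one messy row, one duplicate and three bad rows.
RAW = """\
1, Alice ,ALICE@EXAMPLE.COM
2,Bob,bob@example.com
2,Bob,bob@example.com
x,Broken,row@example.com
3,,missing-name@example.com
4,Carol
"""

# SQLite in memory stands in for MySQL here (an assumption of this sketch).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)
inserted = bulk_import(connection, clean_rows(io.StringIO(RAW)))
stored = connection.execute(
    "SELECT id, name, email FROM customers ORDER BY id"
).fetchall()
print(inserted, stored)
```

Rejecting bad rows before the load means the database never has to fail on or roll back individual records, and inserting everything in a single transaction avoids a commit per row. Both are common reasons why a cleaned bulk import runs much faster than a row-by-row one.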

ETL (Extract, Transform, Load) Tools

We use various UNIX and other tools:

and open-source tools like transxchange2gtfs.

Modifying Bulk PDFs

We have modified 80 PDFs by rotating them 90 degrees left.

Ecommerce Synchronisation

eBay Seller Consulting

We can remotely set up eBay auction items for sellers:

This is a huge exercise in logistics and supply chain management!


Database/Office Integration

We plan to integrate MySQL with OpenOffice using the new connector so that customers can query online databases from within their Word or Excel documents.

Contact Us

Contact us if you need your data mined, integrated, imported, cleaned, backed up or restored.

Created: 11 Nov 2010 18:18
Last Updated: 28 Jul 2022 22:16
