
DebKR

To the Stars


Data Analytics Projects

D3.js and Data Visualisation


11/07/2016 By debkr

Data analysis process:
When we encountered the data analysis process earlier in the year, we saw that the basic process consists of four stages: gather; clean; analyse (including checking for accuracy); and finally, visualise/present. We’ve been doing lots of Python programming, coupled with creating SQL databases, to extract data from some source (web pages, files, XML or JSON files) and sort or store it in a database.

The process we’ve been using during the capstone course (in line with the original Page/Brin search engine process) is to first collect the raw data and store it, unprocessed, in a holding database. From here we’ve gone on to clean up the data and save it in a more structured way in a new, relational database. This results in a smaller database which is quicker to search and retrieve data from. As I found when writing my own search engine application, these first two databases take a long time to retrieve the data, especially when the search engine’s reach is set widely.
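As a rough illustration of that two-stage approach, here is a minimal sketch using Python's built-in sqlite3 module. It is not the capstone course's code or my search engine application: the table names (raw_pages, pages, words, page_words), the function names and the crude tag-stripping are all assumptions made for the example.

```python
import re
import sqlite3
from collections import Counter


def store_raw(conn, pages):
    """Stage 1: store the raw (url, html) pairs, unprocessed, in a holding database."""
    conn.execute("CREATE TABLE IF NOT EXISTS raw_pages (url TEXT PRIMARY KEY, html TEXT)")
    conn.executemany("INSERT OR REPLACE INTO raw_pages (url, html) VALUES (?, ?)", pages)
    conn.commit()


def build_index(raw_conn, index_conn):
    """Stage 2: clean the raw data and save it in a smaller, relational database."""
    index_conn.executescript("""
        CREATE TABLE IF NOT EXISTS pages (id INTEGER PRIMARY KEY, url TEXT UNIQUE);
        CREATE TABLE IF NOT EXISTS words (id INTEGER PRIMARY KEY, word TEXT UNIQUE);
        CREATE TABLE IF NOT EXISTS page_words (
            page_id INTEGER, word_id INTEGER, count INTEGER,
            PRIMARY KEY (page_id, word_id)
        );
    """)
    for url, html in raw_conn.execute("SELECT url, html FROM raw_pages"):
        # Crude clean-up (an assumption): drop anything that looks like a tag,
        # lower-case the rest and count alphabetic words.
        text = re.sub(r"<[^>]+>", " ", html)
        counts = Counter(re.findall(r"[a-z]+", text.lower()))
        index_conn.execute("INSERT OR IGNORE INTO pages (url) VALUES (?)", (url,))
        page_id = index_conn.execute(
            "SELECT id FROM pages WHERE url = ?", (url,)).fetchone()[0]
        for word, count in counts.items():
            index_conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
            word_id = index_conn.execute(
                "SELECT id FROM words WHERE word = ?", (word,)).fetchone()[0]
            index_conn.execute(
                "INSERT OR REPLACE INTO page_words VALUES (?, ?, ?)",
                (page_id, word_id, count))
    index_conn.commit()


def search(index_conn, word):
    """Return (url, count) rows for a word, most frequent first."""
    rows = index_conn.execute(
        """SELECT pages.url, page_words.count FROM page_words
           JOIN pages ON pages.id = page_words.page_id
           JOIN words ON words.id = page_words.word_id
           WHERE words.word = ?
           ORDER BY page_words.count DESC, pages.url""",
        (word.lower(),))
    return rows.fetchall()
```

To try it, open two connections with sqlite3.connect(":memory:"), pass a list of (url, html) pairs to store_raw, call build_index, then query with search. The search only touches the structured database, which is the point of the second stage.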

Filed Under: Blog, Data Analytics, Data Analytics Projects, Personalised Training Plan, Programming, Programming Projects, Web Data Tagged With: coding101

Data Visualisation: Network Graphs (work in progress)


20/03/2016 By debkr

The following is inspired by (and based loosely on) a tutorial by data journalist Clara Guibourg: Network analysis of a Twitter hashtag using Gephi and NodeXL (hat-tip @KirkDBorne). I worked it up in a ‘Heath Robinson’ fashion, since I don’t currently have Java or MS Excel installed on this laptop. (Java v.7+ is required to run Gephi, “the leading visualization and exploration software for all kinds of graphs and networks”; and, without Excel, there is not much use in trying to run the NodeXL Excel 2007+ template “that makes it easy to explore network graphs”.)

Graphs don’t just come in curves:
A standard Cartesian graph consists of a set of (x,y) co-ordinates (the points or vertices on the graph) and the relationship (the edges, arcs or lines) between them. The result is the graphed line, which may also be expressed as some algebraic function specifying the relationship (for example, in its simplest form: y = x).
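A network graph can be described with the same two ingredients, vertices and edges. The sketch below is one simple way to hold such a graph in plain Python, with no Gephi or NodeXL needed. It assumes undirected edges, and the function names and account names are made up for illustration; they are not taken from the tutorial.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected graph as a dict mapping each vertex to its set of neighbours."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degrees(graph):
    """Count the edges touching each vertex (its degree)."""
    return {vertex: len(neighbours) for vertex, neighbours in graph.items()}


# Hypothetical mentions between accounts: each pair is one edge.
edges = [("alice", "bob"), ("alice", "carol"), ("bob", "carol"), ("carol", "dave")]
graph = build_graph(edges)
print(degrees(graph))
```

Here each vertex is an account and each edge a link between two accounts; the degree count is a first, rough measure of how well connected a vertex is.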

Filed Under: Blog, Data Analytics, Data Analytics Projects, Data Science, Digital Business Systems, Personalised Training Plan

Copyright © 2016–2025 · Powered by WordPress On Genesis Framework
