An Update on Where I’m At Mentally/Emotionally:
Returning to the Data Science stream of my Personal Training Plan, I need to get set up with various tools and online accounts so I can go deeper into my studies. As I mentioned when I started this stream, my first reaction to signing up and entering this new world was trepidation: I felt intimidated by it all. So much so that I took a quick 90-degree detour into Business Analytics, where I feel much more comfortable on familiar ground.
But I’m also reminded of how motivated I feel to change my career trajectory, and how confident I am that this is possible, even now in my late 40s. The company I work at has given me some great opportunities to move in a more systems- and data-oriented direction. I feel very lucky about that, so I’m grabbing the opportunity with both hands. With that and my self-led learning (which is, of course, heavily supported by MOOCs from so many excellent educational institutions), I am sure I can make this work out in the best possible way for me.
A Nostalgic Interlude:
Yesterday on LinkedIn I saw a link to a video of teenagers being shown Windows 95 on an old PC. A real walk down memory lane! It made me laugh when one of them commented: “Whoah DOS? Imagine having to type in code to actually get your computer to do stuff for you!” And I was thinking, it’s pretty cool learning Python programming basics like how to open the Windows Command Prompt, type in some code and get the computer to do what I want it to do. I now feel like I’m ahead of the curve already. And I realise I’m not so out of place in this new world after all, although I’m still very much at the beginning of this major adventure.
Learning Some New Tools:
I’m working on Windows 8.1, and the Leek/JHU course requires me to use Git for version control and Git Bash as the Command Line Interface (CLI). The CLI lets me create and navigate folders, and create, edit and run programs, all by typing commands. Downloading Git for Windows (the latest 64-bit version, 2.7.2, with the default installation) also installs the Git Bash CLI.
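To make that concrete, here is a minimal sketch of the kind of commands I mean, typed into Git Bash. The folder and file names (datascience, week1, hello.sh) are made up for illustration, and everything happens inside a temporary folder so nothing else on the machine is touched.

```shell
# Work inside a throwaway folder (hypothetical names throughout).
workdir="$(mktemp -d)"
cd "$workdir"

mkdir -p datascience/week1      # create a folder and a subfolder in one go
cd datascience/week1            # navigate into the new subfolder
pwd                             # show where I am now

# Create a tiny program: a one-line script that prints a greeting.
echo 'echo "Hello from Git Bash"' > hello.sh

bash hello.sh                   # run the program
ls -l                           # list the files in the current folder
cd ../..                        # go back up two levels to the starting folder
```

The same handful of commands (mkdir, cd, pwd, ls) covers most day-to-day navigation.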
I also created a remote repository on GitHub, a web-based platform for software development with social features for following, sharing and collaborating with others, which makes it a great environment for learning to code from people more advanced in the skill. The course recommends that I build up this public profile as my studies continue, as a portfolio of my work as a data scientist. So I didn’t set it up as an anonymous profile; you can find me on GitHub here, although there’s not a lot to see there yet. I played around with creating and forking a repo (from ncarchedi); I still need to read up on the recommended documentation (create a new repo), (fork a repo) and (repo basics).
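For the local side of this, a rough sketch of creating a repository and making a first commit from Git Bash might look like the following. The repo name, user name and email are placeholders I invented for the example, and the GitHub steps are left as comments because they need a real account and a network connection.

```shell
# Create a throwaway folder for a hypothetical repo called my-first-repo.
repo="$(mktemp -d)/my-first-repo"
mkdir -p "$repo"
cd "$repo"

git init -q                                   # turn the folder into a Git repository
git config user.name "Example Learner"        # placeholder identity, stored for this repo only
git config user.email "learner@example.com"

echo "# my-first-repo" > README.md            # a first file to track
git add README.md                             # stage the file
git commit -q -m "Add README"                 # record the first commit
git log --oneline                             # show the commit history

# Linking to GitHub (not run here; the URL is a placeholder):
# git remote add origin https://github.com/<your-username>/my-first-repo.git
# git push -u origin <branch-name>
```

Forking itself happens on the GitHub website; the fork can then be copied to the local machine with git clone.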
Markdown is a lightweight way of formatting plain text so that tools such as GitHub or RStudio can display it as formatted text. A typical use is writing the ReadMe documentation for a program being developed, which can then be included in that program’s repository. The easiest way to create a Markdown file is to write it in a text editor and save it with the .md extension. Further editing can be done in Notepad++. Simple formatting commands include preceding some heading text with two or three pound signs and a space (##) or (###) to create a second-level or third-level heading respectively. An unordered (bullet point) list can be created by preceding each list item with an asterisk (*) and a space.
Help and documentation for Markdown can be found here. For RMarkdown (to use with RStudio, which I’ll be setting up soon) see the help documentation at rstudio.com, or press the MD button in RStudio for a quick guide.
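As a small illustration of the heading and list syntax described above, this sketch writes a sample Markdown file from Git Bash and prints it back. The file contents and headings are invented for the example, and it is saved in a temporary folder.

```shell
# Write a small sample Markdown file into a throwaway folder.
notes_dir="$(mktemp -d)"

cat > "$notes_dir/README.md" <<'EOF'
## Project Title

### Getting Started

* First list item
* Second list item
EOF

# Print the raw file: on GitHub the ## and ### lines would be shown as
# second- and third-level headings, and the * lines as bullet points.
cat "$notes_dir/README.md"
```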
I’ve already downloaded the R programming language from CRAN (Comprehensive R Archive Network). This gives you base R only, which comes with basic functionality for summarising data, plotting data and so on. One of the powerful benefits of using R is the whole host of free, open-source collections of pre-written functions (called Packages) available from the wider R user/developer community, which extend what you can do with the language. There are Packages for all kinds of things, like cleaning data, analysing data, advanced plotting, making interactive apps, and so on.
Most R Packages are available on CRAN and can be listed in R using the available.packages() command. Or click on the Task Views link on the CRAN website to browse packages grouped by topic or task.
R Packages can be installed either from R directly using the install.packages() command, or within RStudio using the menu. Installing only needs to be done once; after that, the package has to be loaded in each new R session using the library() function to make its functions accessible and usable.
Rtools is also needed if you want to build R Packages from source in a Windows environment, for example if you begin developing your own packages. It can be downloaded from CRAN. (I downloaded Rtools33 for R version 3.2.x and installed into c:\Rtools with default settings.)
References
Jeff Leek/JHU: Data Science specialisation by Jeff Leek, Johns Hopkins University on Coursera