DebKR

To the Stars

variable

Building a Tagging Engine in Python using Dictionaries

01/05/2016 By debkr

I started playing around building a Tagging Engine in Python using Lists, but now that I’ve studied a bit more – particularly Dictionaries – I want to see how I can improve what I was working on. Here are a couple of key things I added or changed in this program compared with the version I was working on earlier.

1. Using a dictionary instead of lists for faster counting and simpler recall. This is straightforward, using the get method as taught by Dr. Chuck (see Coding 101 part 7 for more details).

words = dict()
for word in wordlist:
    if word in excluded:
        continue
    # get() returns 0 the first time a word is seen
    words[word] = words.get(word, 0) + 1
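Here is a minimal, self-contained sketch of the same counting loop, followed by one way to rank the results in descending order. The sample text, the contents of excluded and the name ranked are made up for illustration (the excerpt doesn't show how wordlist and excluded are built), and it is written for Python 3 rather than the Python 2 the original posts appear to use.

```python
# Hypothetical sample data: stand-ins for the real text and exclusion list
text = "the cat sat on the mat and the dog sat too"
wordlist = text.split()
excluded = {"the", "on", "and"}

words = dict()
for word in wordlist:
    if word in excluded:
        continue
    # get() returns 0 the first time a word is seen
    words[word] = words.get(word, 0) + 1

# Rank by count (highest first); ties are broken alphabetically
ranked = sorted(words.items(), key=lambda pair: (-pair[1], pair[0]))
for word, count in ranked:
    print(word, count)
```

Sorting on the pair (negative count, word) is just one way of getting a stable, repeatable order when several words share the same count.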

Filed Under: Blog, Personalised Training Plan, Programming, Programming Projects Tagged With: ==, count, descending, dictionary, iteration, len(), length, list, loop, order, print, program, range, raw_input, user, variable, working directory

Coding 101 (part 7)

29/04/2016 By debkr

Lists work great but they leave something on the table:
I’ve been building a Tagging Engine in Python as a little exercise to help me learn by doing, using my knowledge so far. It became clear pretty quickly that I needed a better way to handle pairs of data. In this case I was looking at a list of words and the number of times each of them appeared in a text, so that I could rank the most common words by order of significance (frequency). If I just used one list and appended both the word and its count to the list, one value after the other, there was no way I could sort by count number.

I got round this problem by having two lists, one for the words and another for the word counts. I could then manipulate the data as needed. This did work fine in the simple program I wrote, but it was my usual unwieldy, sledgehammer approach again. I knew there was a way I could handle that pair of data points better – using Python’s Dictionaries functionality – but I didn’t want to rush ahead of the curve. Well, now I get the chance to learn all about dictionaries.
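To make the contrast concrete, here is a rough sketch of both approaches side by side: two parallel lists versus a single dictionary. It is not the original program; the sample words and the names names, counts and tally are invented for illustration, and it is written for Python 3.

```python
sample = ["spam", "eggs", "spam", "ham", "spam", "eggs"]  # hypothetical input

# Approach 1: two parallel lists, where names[i] belongs with counts[i]
names = []
counts = []
for word in sample:
    if word in names:
        counts[names.index(word)] += 1
    else:
        names.append(word)
        counts.append(1)

# Approach 2: one dictionary, where each word is a key and its count the value
tally = {}
for word in sample:
    tally[word] = tally.get(word, 0) + 1

print(names, counts)
print(tally)
# Both approaches end up holding the same word/count pairs
print(dict(zip(names, counts)) == tally)
```

The list version has to keep the two lists in step by position, which is the unwieldy part; the dictionary keeps each pair together under one key.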

Filed Under: Blog, Personalised Training Plan, Programming Tagged With: coding101, count, data, database, dict(), dictionary, function, items(), key/value pair, list, order, python, return, value, variable, word counts

Coding 101 (fun with lists)

18/04/2016 By debkr

This is me just mucking about with lists, testing out what I’ve learnt so far and applying it to little problems I might want to solve. I find it the best way to learn, and it’s more fun than reading books!

Project 1: Building a tagging engine (Mon 18Apr16)

1. This snippet splits each line into a list, creates an iteration variable to loop through all words in the line list and print them out. I add various print statements at suitable points (both variable print statements and descriptive text statements) to help me test the program structure, to make sure it’s doing what I want and expect it to at each point through the loop.
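The snippet itself isn't shown in this excerpt, so what follows is only a guess at its general shape: split each line into a list, loop over the words, and print at each stage to check the structure. The two sample lines stand in for a real text file, the variable names are hypothetical, and the code is Python 3.

```python
# Hypothetical stand-in for the lines of a text file
lines = ["the quick brown fox", "jumps over the lazy dog"]

all_words = []
for line in lines:
    line_words = line.split()                # split the line into a list of words
    print("Line split into:", line_words)    # descriptive check on the split
    for word in line_words:                  # word is the iteration variable
        print("  word:", word)               # check each pass through the loop
        all_words.append(word)

print("Total words:", len(all_words))
```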

Filed Under: Blog, Personalised Training Plan, Programming, Programming Projects Tagged With: add, append(), coding101, continue, count, file, find, frequency, iteration, len(), line, list, loop, phrase, print, program, range, split(), test, text, variable

Coding 101 (part 6)

16/04/2016 By debkr

When strings become spaghetti:
Working with strings and files, particularly when using the for {line} in {filehandle}: construct, allows us to do some cool manipulation of data, by finding, splitting and stripping the data into different chunks based on some repeating factor (such as a comma separating each value in order), then sorting, counting and totalling those values through iterative loops.
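As a small illustration of that pattern, the sketch below loops over a file handle, strips and splits each comma-separated line, then counts, totals and sorts the values. To keep it self-contained, io.StringIO stands in for a real open file, and the data and variable names are invented (Python 3).

```python
import io

# io.StringIO stands in for a file handle; the data is made up
filehandle = io.StringIO("apples,3\npears,5\n\nplums,4\napples,2\n")

count = 0
total = 0
amounts = []
for line in filehandle:
    line = line.strip()              # remove the trailing newline
    if line.find(",") == -1:         # skip lines with no comma (e.g. blank ones)
        continue
    pieces = line.split(",")         # ['apples', '3'], and so on
    amount = int(pieces[1])
    amounts.append(amount)
    count += 1
    total += amount

amounts.sort()
print(count, total, amounts)
```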

Filed Under: Blog, Personalised Training Plan, Programming Tagged With: append(), category, code, coding101, data, element, find, function, handle, index, iteration, key/value pair, len(), length, list, loop, numeric, position, python, range, repeating, return, sorting, split(), startswith(), string, text, value, variable

Coding 101 (part 4)

10/04/2016 By debkr

This post follows on from earlier posts (Coding 101 (part 1) ~ (part 2) ~ (part 3)) and records my responses and learnings from the highly-recommended Python programming book and Coursera specialisation by Charles Severance (see References below).

A quick recap on strings:
Strings are computer-speak for characters, specifically where some object or value has the ‘type’ string. Type is an attribute Python applies to any given object or value so it knows how to handle that object or value, i.e. what kinds of operations can and cannot be applied to it. String, and two numeric types – integer and float – are the most common types within Python.

A string may contain one or more characters, so ‘a’ and ‘0’ are strings, just as ‘abcdefghij’ and ‘Hello world. I am Python.’ are.
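A quick way to see this is to ask Python for the type of a few values and then try an operation that only makes sense for some of them. This is just an illustrative sketch in Python 3; the variable names are made up.

```python
values = ["a", "0", "abcdefghij", 0, 0.5]

# Ask Python which type it has given each value
kinds = [type(value).__name__ for value in values]
for value, kind in zip(values, kinds):
    print(repr(value), "is of type", kind)

# The type decides which operations are allowed
joined = "0" + "1"   # two strings are concatenated
added = 0 + 1        # two integers are added
print(joined, added)

mixed_failed = False
try:
    "0" + 1          # a string plus an integer is not allowed
except TypeError as err:
    mixed_failed = True
    print("TypeError:", err)
```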

Filed Under: Blog, Personalised Training Plan, Programming Tagged With: coding101, data, email, find, function, index, integer, length, loop, name, numeric, position, python, read, return, startswith(), string, strip, type, value, variable

Coding 101 (part 3)

04/04/2016 By debkr

This post follows on from my earlier posts Coding 101 (part 1) and (part 2), and records my responses and learnings from the highly-recommended Python programming book and course by Charles Severance (see References below).

Functions:
Functions are sections of code (a sequence of executable steps) which we want to be able to use and re-use at many points in our program. It may be that we want to read and process a whole range of data over and over (but the process done to all the data is the same) or maybe there are a number of inputs required from the user which all need to be processed the same way. Rather than rewriting the same lines of code again and again in our program, we can give that section of code a name (known as ‘defining the function’). We can then ‘call’ that named function, that is, ask Python to execute the defined sequence of steps, at any future point within our program, and as many times as we want. (In other programming languages this same functionality may be referred to as sub-programs or sub-routines.)
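For instance, a function can be defined once and then called for every input that needs the same treatment. The function name clean_word and the sample inputs below are invented for this sketch, which is written for Python 3.

```python
def clean_word(raw):
    """Strip surrounding punctuation from a word and lower-case it."""
    return raw.strip(".,!?;:\"'").lower()

# Call the same function for each input instead of repeating the steps
cleaned = []
for raw in ["Hello,", "WORLD!", "python."]:
    cleaned.append(clean_word(raw))

print(cleaned)
```

Defining the steps once means that any later change (say, stripping extra characters) only has to be made in one place.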

Filed Under: Blog, Personalised Training Plan, Programming Tagged With: ==, argument, break, code, coding101, condition, construct, continue, data, define, function, input, items(), iteration, largest, list, loop, parameter, program, python, raw_input, reserved words, return, sequence, smallest, string, type, value, variable

Coding 101 (part 2)

21/03/2016 By debkr

This post follows on from my earlier post Coding 101, and records my responses and learnings from the highly-recommended Python programming book and course by Charles Severance (see References below).

Jargon:
I’m working on a glossary here, but it is still very unstructured and massively incomplete, so I suggest staying with Google for now.

Some necessary concepts:
Programs consist of sentences or statements, which may include reserved words that tell Python what we want it to do, but will also include some values in both numerical and text formats. These take the form of either constants, whose values don’t change, or variables, …

Filed Under: Blog, Personalised Training Plan, Programming Tagged With: code, coding101, command line, condition, conditional, elif, except, execute, floating, function, indentation, input, integer, line, print, program, programming, python, read, return, statement, string, style, text, true, try, try .. except, type, value, variable

Copyright © 2016–2025 · Powered by WordPress On Genesis Framework