The bare minimum Python #1: basic types

My doomed attempt to describe all the Python necessary to begin a career in data science

Peter Barrett Bryan
6 min read · Feb 17, 2023
Photo by Diana Polekhina on Unsplash

Motivation

Lots of folks manage to land a role on a data science or machine learning team with enough Python experience to get through the interview but without enough to contribute effectively.

Obviously, a single series can’t cover all the topics necessary for a successful career… but, to paraphrase Icarus, “here goes nothing.”

Treatment of each topic is going to be incomplete. Ideally, though, the content is valuable to folks from lots of different backgrounds. Some very basic programming concepts are assumed (e.g., what is assignment?), but I hope bits and pieces are valuable to readers at various skill levels.

Python’s most common “primitive” types

Primitives are the most basic types. They can be combined and associated in more interesting ways as non-primitives (next section!).

- Integers, floats, complex

There are three basic numeric types: integers, floats, and complex. Integers are whole numbers with no fractional component. Floats are floating point numbers, which may include a fractional component.

>>> type(4)
<class 'int'>
>>> type(4.2)
<class 'float'>
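
The difference between the two types shows up as soon as you start mixing them. Here is a minimal sketch (the variable names are just for illustration) of two things that tend to surprise newcomers: an int combined with a float quietly becomes a float, and floats are approximations, so exact equality checks can fail.

```python
import math

# Mixing an int and a float produces a float.
mixed = 4 + 4.2

# The / operator always returns a float; // floors and keeps ints as ints.
true_quotient = 7 / 2
floor_quotient = 7 // 2

# Floats are binary approximations, so exact comparisons can surprise you.
naive_equal = (0.1 + 0.2 == 0.3)
close_enough = math.isclose(0.1 + 0.2, 0.3)

print(type(mixed), true_quotient, floor_quotient)
print(naive_equal, close_enough)
```

When comparing floats, something like math.isclose is usually a safer bet than ==.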

Crack open your high school textbooks: scientific notation is handy for exceptionally large and small quantities.

>>> 1.2e-3
0.0012
>>> 1.2e3
1200.0

Complex numbers include a real and imaginary (j) component.

>>> type(4+2j)
<class 'complex'>
>>> (4+2j).real
4.0
>>> (4+2j).imag
2.0

If you aren’t familiar with complex numbers, check out some of my articles about their relevance in math and science.

- Booleans

True and False! Note the capitalization. Nothing complicated there, but things get a little tricky in Python when non-boolean values are cast to booleans.

>>> bool(None)
False
>>> bool("")
False
>>> bool([])
False
>>> bool("Any non-empty string")
True
>>> bool(["Any non-empty list, dict, etc."])
True

We call this behavior “truthiness.” None and “” are “falsy” (they evaluate to False), while “Any non-empty string” is “truthy” (it evaluates to True).

Typically, we don’t explicitly cast non-boolean types to booleans like we did above. Instead, we implicitly evaluate truthiness in conditional statements. This will come up again later!
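
To make the implicit version concrete, here is a small sketch. The helper names first_or_default and is_missing are made up for this example. The first leans on truthiness in an if statement; the second shows the usual escape hatch for when 0 is a perfectly valid value and you only want to catch None.

```python
def first_or_default(items, default="nothing here"):
    # `if items:` passes for any non-empty collection and fails for
    # an empty one (or for None).
    if items:
        return items[0]
    return default


def is_missing(value):
    # 0 and 0.0 are falsy too, so test for None explicitly when zero
    # is a legitimate value.
    return value is None


print(first_or_default([3, 4]))
print(first_or_default([]))
print(bool(0), is_missing(0))
```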

- Strings

Python strings can be enclosed in single or double quotes. They can be joined together with +, and adjacent string literals are concatenated implicitly when separated by just whitespace.

>>> "Implicitly" ' concatenated ' "strings"
'Implicitly concatenated strings'
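
One caveat: the implicit trick only works between string literals, not variables. A rough sketch of the options, with illustrative variable names:

```python
first = "data"
second = "science"

# + works on any string values, including variables.
joined_with_plus = first + " " + second

# str.join is the usual choice when combining many pieces.
joined_with_join = " ".join([first, second, "rocks"])

# Adjacent literals are merged into one string, which is handy for
# splitting a long message across lines.
long_message = (
    "This is one string "
    "written across two lines."
)

print(joined_with_plus)
print(joined_with_join)
print(long_message)
```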

Values can be substituted in with “f-strings,” which are cleverly named… they are strings with an “f” in front (“new style”).

>>> def scream_greeting(name: str):
...     print(f"HI, MY NAME IS {name.upper()}")
...
>>> name="Peter Bryan"
>>> scream_greeting(name=name)
HI, MY NAME IS PETER BRYAN

You can do some really fancy stuff with f-strings.

  • Variable name and value (nice for debugging!)
>>> variable = 45
>>> print(f"{variable = }")
variable = 45
  • Decimal to n digits
>>> approx_pi = 22/7
>>> print(f"Pi to 2 places {approx_pi:.2f}")
Pi to 2 places 3.14
>>> print(f"Pi to 3 places {approx_pi:.3f}")
Pi to 3 places 3.143
>>> print(f"Pi to 4 places {approx_pi:.4f}")
Pi to 4 places 3.1429
  • Percentages
>>> percentage = 0.85
>>> print(f"{percentage = :.1%}")
percentage = 85.0%
  • Fancy comma formatting
>>> big_number = 100000000000000000
>>> print(f"The big number's value is {big_number:,}")
The big number's value is 100,000,000,000,000,000
  • Convenient date formatting
>>> import datetime
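# datetime.utcnow() is deprecated as of Python 3.12; this is the timezone-aware equivalent
>>> current_utc_time = datetime.datetime.now(datetime.timezone.utc)

>>> print(f"Standard ugly format: {current_utc_time}")
Standard ugly format: 2022-12-31 02:28:36.406682+00:00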

>>> print(f"Pretty format: {current_utc_time:%m/%d %H:%M:%S%p}")
Pretty format: 12/31 02:28:36AM

For reasons beyond the scope of this article, sometimes (like in logging), you’ll want to use “lazy interpolation” for strings using a percent sign (“old style”).

>>> import logging
>>> name="Peter Bryan"
>>> logging.error("Uh, oh. It's %s.", name)
ERROR:root:Uh, oh. It's Peter Bryan.
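
One commonly cited reason is cost: with the percent style, the logging module only builds the final string if the message will actually be emitted. The sketch below tries to show that. The Expensive class and the "lazy_demo" logger name are hypothetical, invented purely for the demonstration.

```python
import logging


class Expensive:
    """Hypothetical object that counts how often it is turned into a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive summary"


logger = logging.getLogger("lazy_demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)         # DEBUG messages will be dropped

lazy = Expensive()
logger.debug("State: %s", lazy)    # dropped before any formatting happens

eager = Expensive()
logger.debug(f"State: {eager}")    # the f-string is built before debug() runs

print(lazy.str_calls, eager.str_calls)
```

The lazy call never touches __str__, while the f-string pays for the formatting even though the message is thrown away.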

Multi-line strings are handy for “docstrings” and other strings that are, well… multiple lines. They use triple quotes.

>>> print("""
... A
... multi-line
... string""")

A
multi-line
string
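
For the docstring case, here is a minimal sketch. The mean function is only an illustration: a triple-quoted string placed first in a function body becomes that function’s documentation.

```python
def mean(values):
    """Return the arithmetic mean of ``values``.

    The first string literal in a function body is stored as its docstring.
    """
    return sum(values) / len(values)


# The docstring is available at runtime (help(mean) displays it, too).
print(mean.__doc__)
print(mean([1, 2, 3]))
```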

Python’s most common “non-primitive” types

Non-primitives combine and associate primitive types into useful data structures. Let’s take a quick tour of common collections.

- Set

Sets are unordered…

>>> {1, 2, 3} == {3, 2, 1}
True

…and contain strictly unique values…

>>> {1, 2, 3, 3, 3, 3}
{1, 2, 3}

As shown, a set is defined with comma-separated values between squiggly braces. They can be really handy when you are trying to find the unique elements in a collection.
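
Here is a quick sketch of that “unique elements” use case, plus the set operations that tend to come up alongside it. The data is invented for the example.

```python
visits = ["alice", "bob", "alice", "carol", "bob"]

# Converting to a set drops the duplicates.
unique_visitors = set(visits)
num_unique = len(unique_visitors)

train_ids = {1, 2, 3, 4}
test_ids = {3, 4, 5}

overlap = train_ids & test_ids      # intersection
combined = train_ids | test_ids     # union
only_train = train_ids - test_ids   # difference

# Careful: {} creates an empty dict, so an empty set needs set().
empty_set = set()

print(num_unique, overlap, combined, only_train)
print(type({}), type(empty_set))
```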

- List

Unlike sets, lists are ordered and can contain as many duplicate values as you’d like. In contrast to tuples (up next), they are mutable (i.e., you can modify them after creation).

>>> sample_list = [1, 2, 3, 4]

# You can add new items
>>> sample_list.append(5)
>>> print(sample_list)
[1, 2, 3, 4, 5]

# You can remove items by value...
>>> sample_list.remove(4)
>>> print(sample_list)
[1, 2, 3, 5]

# or by position
>>> sample_list.pop(0)
1
>>> print(sample_list)
[2, 3, 5]

As shown, a list is defined with comma-separated values between square brackets.
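
Because lists are ordered, position means something, and because they are mutable, two names can end up pointing at the same list. A small sketch with made-up values:

```python
readings = [10, 20, 20, 30]   # duplicates are fine

first = readings[0]           # indexing starts at 0
last = readings[-1]           # negative indices count from the end
middle = readings[1:3]        # slices exclude the stop index

readings[0] = 99              # lists can be modified in place

# Assignment does not copy: both names refer to the same list.
alias = readings
alias.append(40)

# .copy() makes an independent (shallow) copy.
independent = readings.copy()
independent.append(50)

print(readings)
print(independent)
```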

- Tuple

Tuples are very, very similar to lists, but they aren’t mutable. That is, once they are created, you can’t update them. Instead, you have to create a new one.

>>> sample_tuple = (1, 2, 3)

Immutability might not sound like it has any advantages. In reality, though, there are two big ones. First, tuples are more efficient: in CPython they generally take less memory than an equivalent list and are a bit cheaper to create. That difference matters when you are performing lots and lots of operations on lots and lots of tuples. Second, we sometimes genuinely don’t want folks messing with the values in our collections. In those situations, mutability rhymes with liability, and we’d rather use a data structure that won’t allow modification.

As shown, a tuple is defined with comma-separated values between parentheses.
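
Both points can be poked at directly. This is only a rough sketch: the exact byte counts from sys.getsizeof depend on your Python version and platform, so treat the size comparison as an illustration for CPython, not a benchmark.

```python
import sys

point = (1, 2, 3)

# Item assignment on a tuple raises a TypeError.
try:
    point[0] = 99
except TypeError as error:
    message = str(error)

# "Updating" really means building a new tuple.
moved = (99,) + point[1:]

# Container overhead only; the numbers vary across versions and platforms.
tuple_size = sys.getsizeof((1, 2, 3))
list_size = sys.getsizeof([1, 2, 3])

print(message)
print(point, moved)
print(tuple_size, list_size)
```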

- Dictionary

Dictionaries provide a mapping between “keys” and “values.” When you give a dictionary a valid key, it tells you the associated value.

>>> sample_dict = {"key1": "value1", "key2": "value2"}
>>> print(sample_dict)
{'key1': 'value1', 'key2': 'value2'}

# The length of a dict is the number of key-value pairs
>>> print(len(sample_dict))
2

# You can iterate over keys, values, or both
>>> print(list(sample_dict.keys()))
['key1', 'key2']
>>> print(list(sample_dict.values()))
['value1', 'value2']
>>> print(list(sample_dict.items()))
[('key1', 'value1'), ('key2', 'value2')]

# Unlike tuples, dictionaries are mutable
>>> sample_dict["key3"] = "value3"
>>> sample_dict.pop("key1")
'value1'
>>> print(sample_dict)
{'key2': 'value2', 'key3': 'value3'}

As shown, each key-value pair is joined by a colon. The pairs are comma-separated, and the whole dictionary is wrapped in curly braces.
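
The word “valid” in “valid key” is doing some work. Here is a short sketch of what happens with a key that isn’t there, and a couple of gentler ways to ask. The "missing" key and the default string are made up for the example.

```python
sample_dict = {"key1": "value1", "key2": "value2"}

# Square-bracket lookup raises KeyError for a key that is absent.
try:
    sample_dict["missing"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# .get returns a default instead of raising (None if no default is given).
fallback = sample_dict.get("missing", "default value")

# `in` checks keys, not values.
has_key = "key2" in sample_dict
has_value_as_key = "value2" in sample_dict

# Iterating over .items() yields (key, value) pairs.
pairs = [f"{key}={value}" for key, value in sample_dict.items()]

print(lookup_failed, fallback, has_key, has_value_as_key)
print(pairs)
```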

- Nested structures

If you’ve used other programming languages, this will come as no surprise: you can nest data structures within other data structures.

You can make a list of lists…

>>> [[1, 2, 3], [4, 5, 6]]
[[1, 2, 3], [4, 5, 6]]

Or a dictionary with tuple keys…

>>> {(1, 2): "One and two", (3, 4): "Three and four"}
{(1, 2): 'One and two', (3, 4): 'Three and four'}

You get the point! I promise, though, things get more complicated. Let’s try to construct dictionaries with different types of keys.

>>> {(1, 2): "Tuples are fine as keys in dictionaries"}
{(1, 2): 'Tuples are fine as keys in dictionaries'}

>>> {{1, 2}: "But sets..."}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'

>>> {[1, 2]: "And lists..."}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

>>> {{1: 2}: "And dictionaries are not!"}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'

For now, we’re going to explain this with a simplifying lie: dictionaries can only use immutable types as keys. Less simply but more truthfully, it is because only “hashable” objects can be used as keys. It will take a bit more reading to fully understand what that means. We’ll get there!
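
As a small preview of why “immutable” is only an approximation, here is a sketch that uses the built-in hash() to test candidates. The is_hashable helper is made up for this example. Note that a tuple containing a list is still a tuple, yet it can’t be a key.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (hypothetical helper for this demo)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable((1, 2)))              # a tuple of ints
print(is_hashable([1, 2]))              # a list
print(is_hashable((1, [2, 3])))         # a tuple that contains a list
print(is_hashable(frozenset({1, 2})))   # the immutable cousin of set

# A frozenset works where a plain set raised TypeError above.
lookup = {frozenset({1, 2}): "frozensets are fine as keys"}
print(lookup[frozenset({2, 1})])
```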

What’s next?

We’ll walk through the basic operators of Python. Thanks for reading! If you’re interested in continued reading, be sure to follow!

Peter Barrett Bryan

Software engineer with specializations in remote sensing, machine learning applied to computer vision, and project management.