Information Management Concepts
Preface
Prerequisites
Learning ethics
ASCII
What are Information management concepts?
Why do Information management concepts matter to you?
Research
Ecosystem
Standards, jobs, industry, roles, …
Story
FAQ
Worked examples
Chapter Name
Subsection
Subsubsection
Exercises
- Logic. Mathematics. Code. Automatic verification such as the Lean prover or Frama-C.
- Languages in Anki.
Projects
Summary
FAQ
Reference Notes
Information theory Overview
0.13 Number systems
0.130 Positional number systems
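Assume a base $b \in \mathbb{N}$ with $b > 1$ and digits $a_k \in \{0, 1, \dots, b-1\}$.
Convert from $b$-base to decimal. If $n = (a_{m-1} \dots a_1 a_0)_b$ then $n = \sum_{k=0}^{m-1} a_k b^k$.
Convert from decimal to $b$-base.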
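def a(number, k):
    # k-th digit of `number` as written in decimal (k = 0 is the rightmost digit).
    return (number // 10**k) % 10

def length(number):
    # Count of digits in the decimal writing of `number`.
    return len(str(number))

def from_base_to_decimal(number, b):
    # `number` holds the base-b digits written as a decimal integer,
    # e.g. 1011 for binary 1011, so this only works for b <= 10.
    acc = 0
    for k in range(length(number)):
        acc += a(number, k) * b**k
    return acc

def from_decimal_to_base(number, base):
    # Repeated division: each remainder is the next digit, right to left.
    # Digits are written with str(), so this only works for base <= 10.
    base_number = ""
    q = number
    while q > 0:
        qk = q // base
        ak = q - base*qk
        base_number = str(ak) + base_number
        q = qk
    return base_number or "0"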
function from_base_to_decimal(number, base) {
  // Horner's rule: fold the digits left to right, acc = digit + base * acc.
  return String(number)
    .split('')
    .map((n) => parseInt(n, base))
    .reduce((acc, current) => {
      console.log(`${current}+${base}*${acc}=${current + base * acc}`);
      return current + base * acc;
    }, 0);
}

function from_decimal_to_base(number, base) {
  if (base <= 1) {
    return;
  }
  // Declare locals explicitly so they do not leak as globals.
  let base_number = "";
  let q = number;
  while (q > 0) {
    const qk = Math.trunc(q / base);
    const ak = q - base * qk;
    console.log(`${base * qk + ak}=${base}*${qk}+${ak}`);
    base_number = ak.toString(base).toUpperCase() + base_number;
    q = qk;
  }
  return base_number;
}

function from_any_base_to_any_base(number, start_base, end_base) {
  console.log(`from base ${start_base} to decimal`);
  const start_number = from_base_to_decimal(number, start_base);
  console.log(`from decimal to base ${end_base}`);
  return from_decimal_to_base(start_number, end_base);
}
Fast algorithm.
http://www.opentextbookstore.com/mathinsociety/2.4/HistoricalCounting.pdf
https://www.gcu.ac.uk/media/gcalwebv2/gcuoutreach/NUMBERS & NUMBER SYSTEMS.pdf
https://www.radford.edu/~wacase/Number Systems Unit Math 116.pdf
https://www.cl.cam.ac.uk/teaching/1415/CompFund/NumberSystemsAnnotated.pdf
https://en.wikipedia.org/wiki/Numeral_system
https://www.cs.princeton.edu/courses/archive/spr15/cos217/lectures/03_NumberSystems.pdf
http://www.unitconversion.org/unit_converter/numbers-ex.html
COMS W3827 Fundamentals of Computer Systems
0.1311 Binary and boolean function
Binary vs boolean and bits
Ecosystem
0.132 Nonpositional Number Systems
0.14 Data Storage
0.140 Data Types
0.141 Storing Numbers
Method of complements
Nine's complement
Ten's complement
One's complement
One's complement is an operation to invert bits.
Worked examples.
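A minimal sketch of the bit inversion described above, assuming the number is given as a fixed-width string of 0s and 1s. The function name `ones_complement` is made up for this illustration.

```python
def ones_complement(bits: str) -> str:
    """Invert every bit of a binary string, keeping its width."""
    return "".join("1" if b == "0" else "0" for b in bits)


print(ones_complement("00000101"))  # 11111010
```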
Two's complement
There is only one zero in two's complement notation.
One's complement + 1. Because $x + \bar{x} = 2^n - 1$ (all ones), but $x + (\bar{x} + 1) = 2^n \equiv 0 \pmod{2^n}$.
Worked examples.
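A small sketch of "invert, then add 1", assuming a fixed width of 8 bits by default. The name `twos_complement` and the `width` parameter are assumptions made for this example, not part of the notes.

```python
def twos_complement(value: int, width: int = 8) -> str:
    """Bit pattern that represents -value in `width` bits."""
    mask = (1 << width) - 1
    inverted = ~value & mask  # one's complement, cut to `width` bits
    return format((inverted + 1) & mask, f"0{width}b")


print(twos_complement(5))  # 11111011
print(twos_complement(0))  # 00000000
```

Negating zero gives zero again, which is why this notation has a single zero.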
Trick.
https://www.csestack.org/how-to-find-2s-complement/
Example. Convert a bit pattern in two's complement format to an integer.
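One possible way to do this conversion, assuming the pattern arrives as a string whose length is the word width. The helper name `from_twos_complement` is hypothetical.

```python
def from_twos_complement(bits: str) -> int:
    """Read a two's complement bit string as a signed integer."""
    width = len(bits)
    unsigned = int(bits, 2)
    # When the sign bit is set, the pattern stands for unsigned - 2**width.
    if bits[0] == "1":
        return unsigned - (1 << width)
    return unsigned


print(from_twos_complement("11111011"))  # -5
```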
How to encode negative numbers in binary number systems?
Gray code
Base −2
8-4-2-1 code is also called BCD (binary-coded decimal)
Sign and magnitude
Offset binary, also called excess-K or biased representation
Excess-8 (biased)
Zig-zag encoding
Excess-3, 3-excess or 10-excess-3 binary code (often abbreviated as XS-3, 3XS or X3), shifted binary or Stibitz code. https://en.wikipedia.org/wiki/Excess-3
Complements
- Ones' complement
Two's complement
Two's complement is the easiest to implement in hardware, which may be the ultimate reason for its widespread popularity. Choo, Hunsoo; Muhammad, K.; Roy, K. (February 2003). "Two's complement computation sharing multiplier and its applications to high performance DFE". IEEE Transactions on Signal Processing. 51 (2): 458–469. doi:10.1109/TSP.2002.806984.
Summary
Contents of memory | Unsigned | Sign-and-magnitude | Two's complement | One's complement |
---|---|---|---|---|
0000 | 0 | 0 | +0 | 0 |
0001 | 1 | 1 | +1 | 1 |
0010 | 2 | 2 | +2 | 2 |
0011 | 3 | 3 | +3 | 3 |
0100 | 4 | 4 | +4 | 4 |
0101 | 5 | 5 | +5 | 5 |
0110 | 6 | 6 | +6 | 6 |
0111 | 7 | 7 | +7 | 7 |
1000 | 8 | -0 | -8 | -7 |
1001 | 9 | -1 | -7 | -6 |
1010 | 10 | -2 | -6 | -5 |
1011 | 11 | -3 | -5 | -4 |
1100 | 12 | -4 | -4 | -3 |
1101 | 13 | -5 | -3 | -2 |
1110 | 14 | -6 | -2 | -1 |
1111 | 15 | -7 | -1 | -0 |
Notes | | The leftmost bit defines the sign: 0 means positive, 1 means negative. This has two zeros (+0 and -0). | The leftmost bit defines the sign: 0 means positive, 1 means negative. | The leftmost bit defines the sign: 0 means positive, 1 means negative. This has two zeros (+0 and -0). |
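The rows of the table can be recomputed with a short sketch like the one below. The function name `interpret` and the dictionary keys are made up for illustration. Python has no negative zero for integers, so the -0 entries come out as 0.

```python
def interpret(bits: str) -> dict:
    """Read one bit pattern under the four conventions of the table."""
    n = len(bits)
    unsigned = int(bits, 2)
    negative = bits[0] == "1"
    magnitude = int(bits[1:], 2)
    return {
        "unsigned": unsigned,
        "sign_magnitude": -magnitude if negative else magnitude,
        "twos_complement": unsigned - (1 << n) if negative else unsigned,
        "ones_complement": unsigned - ((1 << n) - 1) if negative else unsigned,
    }


for value in range(16):
    pattern = format(value, "04b")
    print(pattern, interpret(pattern))
```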
How to encode real numbers in binary number systems?
Fixed-point
Floating-point
IEEE 754 format
Three parts: a sign, an exponent (the shifter), and a significand (a fixed-point number).
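To see the three parts, the sketch below splits a single-precision (binary32) value into its raw fields with the standard `struct` module. The function name `decompose_float32` is an assumption for this example. The exponent is returned as stored (biased by 127) and the fraction as the 23 stored bits.

```python
import struct


def decompose_float32(x: float):
    """Return (sign, biased exponent, fraction bits) of x as binary32."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    exponent = (raw >> 23) & 0xFF  # stored with a bias of 127
    fraction = raw & 0x7FFFFF      # 23 stored bits of the significand
    return sign, exponent, fraction


print(decompose_float32(1.0))    # (0, 127, 0)
print(decompose_float32(-6.25))  # (1, 129, 4718592)
```

For -6.25 = -1.5625 × 2^2, the stored exponent is 2 + 127 = 129 and the fraction is 0.5625 × 2^23 = 4718592.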
0.142 Storing Text
A character is an element of a grammar (English, Spanish, ...) plus the human-computer interface accepted by convention (backspace, delete, escape, @, ...), i.e. a code.
For example, English grammar contributes the letters and punctuation marks, and the human-computer interface in ASCII contributes the control characters.
We can represent each character with a bit pattern of $n$ bits (bit pattern length = $n$). If we have a character set $C$, its cardinality is $|C|$. The computer understands only $0$ and $1$, so $n$ bits distinguish $2^n$ characters and we need $2^n \ge |C|$.
Therefore, $n = \lceil \log_2 |C| \rceil$.
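As a rough check of the bit-pattern length, this sketch computes the smallest number of bits that can tell apart a given number of characters. The name `bits_needed` is hypothetical. It uses integer arithmetic instead of a floating-point logarithm to avoid rounding issues.

```python
def bits_needed(symbol_count: int) -> int:
    """Smallest n such that 2**n >= symbol_count (symbol_count >= 1)."""
    return (symbol_count - 1).bit_length()


print(bits_needed(26))         # 5, enough for the letters a-z
print(bits_needed(128))        # 7, the size of the ASCII set
print(bits_needed(1_114_112))  # 21, the size of the Unicode code space
```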
0.1421 ASCII Code and UNICODE
IEEE milestones.
ASCII.
UNICODE. Emojis.
printf "\r12345\n\r6\n"; printf "\r5\n"
printf "\r12345"; printf "\r5\n"
https://stackoverflow.com/questions/3091524/what-are-carriage-return-linefeed-and-form-feed
Hex Code.
How does it work internally?
https://www.w3schools.com/tags/ref_urlencode.ASP
https://onlineunicodetools.com/convert-unicode-to-hex
https://en.wikipedia.org/wiki/List_of_Unicode_characters
Mackenzie, Charles E. (1980). Coded Character Sets, History and Development (PDF). The Systems Programming Series (1 ed.). Addison-Wesley Publishing Company, Inc. pp. 6, 66, 211, 215, 217, 220, 223, 228, 236–238, 243–245, 247–253, 423, 425–428, 435–439. ISBN 978-0-201-14460-4. LCCN 77-90165. Archived (PDF) from the original on May 26, 2016. Retrieved August 25, 2019.
https://pjb.com.au/comp/diacritics.html
ASA standard X3.4-1963
https://stackoverflow.com/questions/1761051/difference-between-n-and-r
0.143 Storing Audio
0.144 Storing Images
0.145 Storing Videos
0.15 Operations on Data
0.150 Logic Operations
0.151 Shift Operations
0.152 Arithmetic Operations
Sum of naturals.
Rules. 0+0=0, 0+1=1, 1+0=1 and 1+1=10
Binary Addition Algorithm. (2015, July 06). Retrieved from https://chortle.ccsu.edu/assemblytutorial/zAppendixE/binaryAdd.html
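The four rules above, plus a carry, are enough to add column by column from right to left. A minimal sketch, assuming the operands are strings of 0s and 1s (the name `add_binary` is made up for this example):

```python
def add_binary(a: str, b: str) -> str:
    """Add two unsigned binary strings column by column."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    digits = []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        digits.append(str(total % 2))  # digit written in this column
        carry = total // 2             # 1 + 1 = 10: write 0, carry 1
    if carry:
        digits.append("1")
    return "".join(reversed(digits))


print(add_binary("1", "1"))       # 10
print(add_binary("1011", "110"))  # 10001
```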
Subtraction of naturals (No complements).
Rules. 0-0=0, 1-0=1, 1-1=0, and 0-1=1 with a borrow from the next column (10-1=1).
https://www.calculator.net/binary-calculator.html
Subtraction of naturals (Two's complement)
https://www.exploringbinary.com/twos-complement-converter/
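Subtraction with two's complement replaces a - b by a + (two's complement of b) and discards the carry out of the word. A sketch under the assumption of an 8-bit word (the name `subtract` and the `width` parameter are illustrative only):

```python
def subtract(a: int, b: int, width: int = 8) -> int:
    """Compute a - b as a + (~b + 1), keeping only `width` bits."""
    mask = (1 << width) - 1
    negated_b = (~b + 1) & mask    # two's complement of b
    return (a + negated_b) & mask  # the carry out of the word is discarded


print(format(subtract(7, 5), "08b"))  # 00000010
print(format(subtract(5, 7), "08b"))  # 11111110
```

The second result is the two's complement pattern of -2, so the same adder handles negative results.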
Why does char - '0' successfully convert a char to int in C?
https://www.quora.com/Why-does-char-0-successfully-convert-a-char-to-int-in-C/answer/Greg-Kemnitz
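The short version of the answer is that ASCII stores the digits '0' to '9' in consecutive codes (48 to 57), so subtracting the code of '0' leaves the digit's value. The same arithmetic can be sketched in Python with `ord` (the name `digit_value` is hypothetical):

```python
def digit_value(ch: str) -> int:
    """Numeric value of one decimal digit character, as in C's ch - '0'."""
    return ord(ch) - ord("0")


print(ord("0"), ord("9"))  # 48 57
print(digit_value("7"))    # 7
```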
Nine's complement
Pascaline.
0.153 Bitwise operations in C
Bitwise Operators in C and C++
x & 1 is equivalent to x % 2 (no sign).
If x & 1 is true, then x is an odd number.
First of all, an example:
5 (00000101) & 1 (00000001)
00000101 &
00000001
00000001 (1, true)
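Informal proof. For a non-complement (unsigned) binary number $x = \sum_{k=0}^{n-1} a_k 2^k$, where $k$ is the bit position counted from the right starting at 0, every term with $k \ge 1$ is even, so $x \bmod 2 = a_0$, which is exactly the bit that x & 1 keeps.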
#include <list>

int main() {
    std::list<int> v = { 1, 2, 3, 4, 5, 6 };
    auto it = v.begin();
    while (it != v.end())
    {
        // remove odd numbers.
        if (*it & 1)
        {
            // `erase()` invalidates the iterator, use returned iterator
            it = v.erase(it);
        }
        // The iterator is incremented only in the else branch because
        // `erase()` already returned the iterator to the next element.
        else {
            ++it;
        }
    }
    // v now holds { 2, 4, 6 }
}
x >> 1 is equivalent to x / 2 (integer division, no sign).
https://www.cprogramming.com/tutorial/bitwise_operators.html
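Both equivalences can be checked quickly for non-negative values. The sketch below uses Python, where `//` is floor division; the helper names `is_odd` and `half` are made up here. In C the picture differs for negative operands: `/` truncates toward zero and `%` can return a negative remainder, and right-shifting a negative value is implementation-defined, which is why the notes restrict the claim to unsigned values.

```python
def is_odd(x: int) -> bool:
    """True when the lowest bit of x is set."""
    return (x & 1) == 1


def half(x: int) -> int:
    """Drop the lowest bit, i.e. divide by two and round down."""
    return x >> 1


for x in range(16):
    assert is_odd(x) == (x % 2 == 1)
    assert half(x) == x // 2

print(is_odd(5), half(5))  # True 2
```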
0.154 Boolean Algebra and Digital Logic: Arithmetic Logic Unit
Summary
Worked examples
Which of the following decimal numbers has an exact representation in binary notation? A) 0.1 B) 0.2 C) 0.3 D) 0.4 E) 0.5
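One way to reason about this question: a decimal fraction has a finite binary expansion only when its reduced denominator is a power of two. The sketch below tests that with `fractions.Fraction`. The function name `is_exact_in_binary` is an assumption for this example.

```python
from fractions import Fraction


def is_exact_in_binary(decimal_text: str) -> bool:
    """True when the reduced denominator is a power of two."""
    denominator = Fraction(decimal_text).denominator
    while denominator % 2 == 0:
        denominator //= 2
    return denominator == 1


for text in ["0.1", "0.2", "0.3", "0.4", "0.5"]:
    print(text, is_exact_in_binary(text))
```

Only 0.5 = 1/2 passes; the other options reduce to denominators of 5 or 10.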
Floating numbers
IEEE 754
File
A file is a finite-length stream of bytes. We can append bytes to it up to the limits of the host storage system, read from any location in it, and write updates to any location. The layer above the file is the file system.
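The three operations (append, read from any location, update in place) can be tried with a small sketch. The scratch path below is created with `tempfile` and is only an assumption for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")

with open(path, "wb") as f:   # create the byte stream
    f.write(b"Hello World")

with open(path, "ab") as f:   # append bytes at the end
    f.write(b"!")

with open(path, "r+b") as f:  # update bytes at an arbitrary offset
    f.seek(6)
    f.write(b"Files")
    f.seek(0)                 # read back from the start
    content = f.read()

print(content)  # b'Hello Files!'
```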
In Unix, everything is a file. What is a directory? A file. What is a process? A file. What is a driver? A file. If you don't separate how the data looks from how it is processed, you have blobs.
Blobs and plain text.
Blobs tracking in Git, databases, spreadsheets
Blobs and byte arrays
Binary-to-text encoding
Split a file into chunks
Images
How can you view an image and process it? OpenCV.
PACS
PACS (picture archiving and communication system)
DICOM
Raster images
png, jpg, webp
Vector images
SVG and canvas
High-performance images and animations
WebGL, OpenGL, DirectX, Vulkan
Computer graphics
QR Code
By field
Myers Automatic Booth
(LAS, Geology)
(G-code, Manufacturing)
(EDF, Neuroscience)
(FASTA, Genetics)
GIS (Geographic Information System)
Financial Information eXchange (FIX)
FITS (Astronomy format)
HDF5
Hadoop
Avro
Delimited
JSON
Parquet
XML, JSON, YAML, TOML (Information Technology)
https://www.npmjs.com/package/serialize-javascript
https://docs.python.org/3/library/pickle.html
(PDF, Standard)
- PyPDF2: Useful for extracting text and metadata from PDFs.
- PDFMiner: More advanced than PyPDF2, better for more complicated layouts.
- Tabula-py: Ideal for extracting tables from PDFs.
- PyMuPDF (FitZ): Good for dealing with both text and images.
pikepdf
Poppler Utils:
PDFtk
References
RPA
https://github.com/abhi18av/awesome-pdf
Developing with PDF by Leonard Rosenthol
PDF Explained by John Whitington
PDFtk
https://www.youtube.com/watch?v=KmP7pbcAl-8
https://www.youtube.com/watch?v=48tFB_sjHgY&ab_channel=Computerphile
SWIFT
DICOM Standard
DICOM networking protocol
LaTeX
Video astronomy
File as Programming
CSS, JS, HTML, WASM
Database as File
Internet media types (MIME types)
ASN.1, PEM and DER
Video
The Filmmaker's Handbook
ffmpeg
Barcode image
https://github.com/zxing/zxing
1D product | 1D industrial | 2D |
---|---|---|
UPC-A | Code 39 | QR Code |
UPC-E | Code 93 | Data Matrix |
EAN-8 | Code 128 | Aztec |
EAN-13 | Codabar | PDF 417 |
UPC/EAN Extension 2/5 | ITF | MaxiCode |
RSS-14 | | |
RSS-Expanded | | |
Hex editing
We told you that files are binary, right? But how can we show you that if you don't feel it? And yes, you could write your programs in hex ("since C and assembly are too high-level", you're thinking), and yes, you could reverse engineer software.
In Unix, you have different options to edit and show files as hex. Some choices are xxd, hexdump [2], and so on [4]. We're going to use xxd and vim.
Write a plain-text file containing "Hello World" with vim, then switch to xxd mode (:%!xxd).
But how can you show "Hello World" in binary? Use xxd -b.
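To get a feel for what these tools print, here is a rough Python sketch that renders the bytes of a string in hex and in binary. It imitates only the byte columns, not the exact xxd layout. The helper names `to_hex` and `to_bits` are made up for this example.

```python
def to_hex(data: bytes) -> str:
    """Two hex digits per byte, separated by spaces."""
    return " ".join(format(byte, "02x") for byte in data)


def to_bits(data: bytes) -> str:
    """Eight binary digits per byte, separated by spaces."""
    return " ".join(format(byte, "08b") for byte in data)


text = "Hello World".encode("ascii")
print(to_hex(text))
print(to_bits(text))
```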
Writing a binary file
Carriage return and CRLF
Code pages, character encoding, UTF-8 and BOM
https://www.youtube.com/watch?v=jeIBNn5Y5fI&list=PL0M0zPgJ3HSesuPIObeUVQNbKqlw5U2Vr&index=6
https://www.tecmint.com/convert-files-to-utf-8-encoding-in-linux/
file -i [file]
iconv options -f from-encoding -t to-encoding inputfile(s) -o outputfile
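The same kind of conversion can be sketched in Python: decode the bytes with the source encoding, then encode them with the target one. The sample word and variable names are assumptions chosen for illustration.

```python
# Sample text with two non-ASCII letters (an assumption for this demo).
text = "ñandú"

latin1_bytes = text.encode("iso-8859-1")  # one byte per character
utf8_bytes = latin1_bytes.decode("iso-8859-1").encode("utf-8")

print(len(latin1_bytes))  # 5
print(len(utf8_bytes))    # 7
```

The UTF-8 version is longer because ñ and ú each take two bytes there.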
Unicode. UTF-8. Emojis.
https://www.php.net/manual/en/ref.mbstring.php
Charset
mb_detect_encoding and mb_convert_encoding
Content-Type in HTTP
application/json;charset=ISO-8859-1
json_decode($response, false, JSON_UNESCAPED_UNICODE);
https://encoding.spec.whatwg.org/
Typography, font-families and font-faces
References
[1] hexdump(1) - Linux manual page. (2022, December 19). Retrieved from https://man7.org/linux/man-pages/man1/hexdump.1.html
[2] Explains, F. S. (2021, December 24). Writing binary files: a tutorial in C and Python (security@cambridge screencast). Youtube. Retrieved from https://www.youtube.com/watch?v=jAu0oyxsP20&list=PLbyW0t9gkXg2lcU_T4LgZryxrcJBdP_iF&index=6&ab_channel=FrankStajanoExplains
[3] Hex dump. (2023, January 14). Retrieved from https://vim.fandom.com/wiki/Hex_dump
[4] Top Hex Editors for Linux. (2021, May 17). Retrieved from https://www.tecmint.com/best-hex-editors-for-linux
Data type size
Extended ASCII
https://www.gnu.org/software/emacs/manual/html_node/emacs/User-Input.html
Threads
Overflow
Automation and insights
Drawing.
Office automation. Spreadsheet programming.
Spreadsheets Are An Awesome Functional Programming Tool | Lucas Reis' Blog. (2018, December 29). Retrieved from https://lucasmreis.github.io/blog/spreadsheets-are-an-awesome-functional-programming-tool
Conference, S. L. (2014, September 21). "Spreadsheets for developers" by Felienne Hermans. Youtube. Retrieved from https://www.youtube.com/watch?v=0CKru5d4GPk&ab_channel=StrangeLoopConference
Editor
Keyboard remapping
Exercises
- Logic. Mathematics. Code. Automatic verification such as the Lean prover or Frama-C.
- Languages in Anki.
Projects
Is it a standard?
Is it capable of rendering HTML and JS?
Summary
FAQ
Reference Notes
References
http://neerc.ifmo.ru/wiki/index.php
https://tug.org/texshowcase/cheat.pdf
Explains, F. S. (2021, March 19). Touch typing: why you MUST learn it ASAP. Youtube. Retrieved from https://www.youtube.com/watch?v=0FUsiyiWWw8&list=PLbyW0t9gkXg2OjpTqGig1nWCqzxwrF8h5&ab_channel=FrankStajanoExplains
Next steps
https://www.youtube.com/watch?v=jeIBNn5Y5fI&list=PL0M0zPgJ3HSesuPIObeUVQNbKqlw5U2Vr&index=6
Problems
From LeetCode, HackerRank, ...
Given a signed 32-bit integer x, return x with its digits reversed. If reversing x causes the value to go outside the signed 32-bit integer range $[-2^{31}, 2^{31} - 1]$, then return 0.
https://math.stackexchange.com/questions/480068/how-to-reverse-digits-of-an-integer-mathematically
Assume the environment does not allow you to store 64-bit integers (signed or unsigned).
- Multiply
- Divide
- Length
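A possible solution sketch for the digit-reversal problem, built from multiply, divide and remainder only. Python integers never overflow, so the 32-bit limit is imitated by checking the bound before each multiplication. The names `reverse` and `INT_MAX` are assumptions for this example.

```python
INT_MAX = 2**31 - 1


def reverse(x: int) -> int:
    """Reverse the decimal digits of x, or return 0 on 32-bit overflow."""
    sign = -1 if x < 0 else 1
    remaining = abs(x)
    result = 0
    while remaining > 0:
        digit = remaining % 10
        remaining //= 10
        # Check before multiplying so `result` never leaves the 32-bit range.
        if result > (INT_MAX - digit) // 10:
            return 0
        result = result * 10 + digit
    return sign * result


print(reverse(123))         # 321
print(reverse(-120))        # -21
print(reverse(1534236469))  # 0
```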