punkshell_module_punk::lib(0) 0.1.1 doc "punk library"

Name

punkshell_module_punk::lib - punk general utility functions

Table Of Contents
Synopsis
Description
Overview
- Concepts
- dependencies
API
Internal
- Namespace punk::lib::system
Keywords
Copyright

Description

This is a set of utility functions that are commonly used across punk modules or are just considered to be general-purpose functions.

The base set includes string and math functions but has no specific theme.

Overview

overview of punk::lib

Concepts

The punk::lib modules should have no strong dependencies other than Tcl

Dependencies that only affect display or additional functionality may be included - but should fail gracefully if not present, and only when a function is called that uses one of these soft dependencies.

This requirement for no strong dependencies means that many utility functions that might otherwise seem worthy of inclusion here are not present.

dependencies

packages used by punk::lib

Tcl 8.6-

API

Namespace punk::lib::class

class definitions

Namespace punk::lib::compat

compatibility functions for features that may not be available in earlier Tcl versions

These are generally 'forward compatibility' functions ie allowing earlier versions to use later features/idioms by using a Tcl-only version of a missing builtin.

Such Tcl-only versions will inevitably be less performant - perhaps significantly so.

lremove list ?index ...?: Forwards compatible lremove for versions 8.6 or less to support equivalent 8.7 lremove
lpop listvar ?index?: Forwards compatible lpop for versions 8.6 or less to support equivalent 8.7 lpop

Namespace punk::lib

Core API functions for punk::lib

lindex_resolve list index

Resolve an index which may be of the forms accepted by Tcl list commands such as end-2 or 2+2 to the actual integer index for the supplied list

Users may define procs which accept a list index and wish to accept the forms understood by Tcl.

This means the proc may be called with something like $x+2 end-$y etc

Sometimes the actual integer index is desired.

We want to resolve the index used, without passing arbitrary expressions into the 'expr' function - which could have security risks.

lindex_resolve will parse the index expression and return -1 if the supplied index expression is out of bounds for the supplied list.

Otherwise it will return an integer corresponding to the position in the list.

Like Tcl list commands - it will produce an error if the form of the index is not acceptable.
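The idea of resolving an index without handing raw input to expr can be sketched as follows. This is a minimal illustration only, not the punk::lib implementation: the proc name sketch_index_resolve is hypothetical, and it assumes only the forms integer, integer+integer, integer-integer, end, end+integer and end-integer need to be handled.

```tcl
# Hypothetical sketch - not the punk::lib implementation.
# Each form is matched with a regexp and the integers are extracted with scan,
# so arbitrary caller-supplied text never reaches expr.
proc sketch_index_resolve {list index} {
    set len [llength $list]
    if {[regexp {^(-?\d+)$} $index -> a]} {
        set i [scan $a %d]
    } elseif {[regexp {^(-?\d+)([+-])(-?\d+)$} $index -> a op b]} {
        set a [scan $a %d]
        set b [scan $b %d]
        set i [expr {$op eq "+" ? $a + $b : $a - $b}]
    } elseif {[regexp {^end(?:([+-])(-?\d+))?$} $index -> op b]} {
        set i [expr {$len - 1}]
        if {$op ne ""} {
            set b [scan $b %d]
            set i [expr {$op eq "+" ? $i + $b : $i - $b}]
        }
    } else {
        error "bad index \"$index\""
    }
    # out of bounds is reported as -1, as described above
    if {$i < 0 || $i >= $len} {return -1}
    return $i
}

set items {a b c d e}
puts [sketch_index_resolve $items end-2]  ;# 2
puts [sketch_index_resolve $items 2+2]    ;# 4
puts [sketch_index_resolve $items 10]     ;# -1
```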

K x y

The K-combinator function - returns the first argument, x and discards y

see https://wiki.tcl-lang.org/page/K

It is used in cases where command-substitution at the calling-point performs some desired effect.
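A common use is a 'pop' idiom, where the second argument is evaluated only for its side effect. The sketch below follows the definition given on the wiki page; the helper sketch_pop is a hypothetical name used for illustration.

```tcl
# K as described above: return the first argument, discard the second.
proc K {x y} {return $x}

# Hypothetical helper: return the last element of a list variable,
# shrinking the list as a side effect of evaluating K's second argument.
proc sketch_pop {listvar} {
    upvar 1 $listvar l
    K [lindex $l end] [set l [lrange $l 0 end-1]]
}

set stack {a b c}
set top [sketch_pop stack]
puts "$top / $stack"  ;# c / a b
```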

is_utf8_multibyteprefix str

Returns a boolean indicating whether str is potentially a prefix for a multibyte utf-8 character

ie - tests if it is possible that appending more data will result in a utf-8 codepoint

Will return false for an already complete utf-8 codepoint

It is assumed the incomplete sequence is at the beginning of the str argument

Suitable input for this might be from the unreturned tail portion of get_utf8_leading $testbytes

e.g using: set head [get_utf8_leading $testbytes] ; set tail [string range $testbytes [string length $head] end]
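The prefix test can be illustrated with a simplified sketch based on the utf-8 lead byte. The proc names here are hypothetical and this is not the punk::lib code: it assumes the input holds only byte values, and it ignores finer points such as overlong encodings and surrogate ranges.

```tcl
# Hypothetical sketch - simplified, not the punk::lib implementation.
# Number of bytes a utf-8 sequence should have, judged from its lead byte (0 = invalid lead).
proc sketch_utf8_expected_length {leadbyte} {
    if {$leadbyte < 0x80} {return 1}
    if {$leadbyte >= 0xC2 && $leadbyte <= 0xDF} {return 2}
    if {$leadbyte >= 0xE0 && $leadbyte <= 0xEF} {return 3}
    if {$leadbyte >= 0xF0 && $leadbyte <= 0xF4} {return 4}
    return 0
}

proc sketch_is_multibyte_prefix {bytes} {
    set vals {}
    binary scan $bytes cu* vals
    if {[llength $vals] == 0} {return 0}
    set expected [sketch_utf8_expected_length [lindex $vals 0]]
    # single-byte or invalid lead byte: cannot be a multibyte prefix
    if {$expected < 2} {return 0}
    # already complete (or too long): not a prefix
    if {[llength $vals] >= $expected} {return 0}
    # any bytes after the lead must be continuation bytes (10xxxxxx)
    foreach b [lrange $vals 1 end] {
        if {($b & 0xC0) != 0x80} {return 0}
    }
    return 1
}

puts [sketch_is_multibyte_prefix "\xE2\x82"]      ;# 1 - first 2 bytes of a 3-byte sequence
puts [sketch_is_multibyte_prefix "\xE2\x82\xAC"]  ;# 0 - already complete
```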

is_utf8_single 1234bytes

Tests input of 1,2,3 or 4 bytes and responds with a boolean indicating if it is a valid utf-8 character (codepoint)

get_utf8_leading rawbytes

return the leading portion of rawbytes that is a valid utf8 sequence.

This will stop at the point at which the bytes can't be interpreted as a complete utf-8 codepoint

e.g It will not return the first byte or 2 of a 3-byte utf-8 character if the last byte is missing, and will return only the valid utf-8 string from before the first byte of the incomplete character.

It will also only return the prefix before any bytes that cannot be part of a utf-8 sequence at all.

Note that while this will return valid utf8 - it has no knowledge of grapheme clusters or diacritics

This means if it is being used to process bytes split at some arbitrary point - the trailing data that isn't returned could be part of a grapheme cluster that belongs with the last character of the leading string already returned

The utf-8 BOM \xEF\xBB\xBF is a valid UTF8 3-byte sequence and so can also be returned as part of the leading utf8 bytes

hex2dec ?option value...? list_largeHex

Convert a list of (possibly large) unprefixed hex strings to their decimal values

hex2dec accepts and ignores internal underscores in the same manner as Tcl 8.7+ numbers e.g hex2dec FF_FF returns 65535

Leading and trailing underscores are ignored as a matter of implementation convenience - but this shouldn't be relied upon.

Leading or trailing whitespace in each list member is allowed e.g hex2dec " F" returns 15

Internal whitespace e.g "F F" is not permitted - but a completely empty element "" is allowed and will return 0
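The rules above can be sketched in plain Tcl as below. This is an illustration under stated assumptions, not the punk::lib implementation: sketch_hex2dec is a hypothetical name, it takes no options, and it validates each element before letting expr interpret a 0x-prefixed string.

```tcl
# Hypothetical sketch - not the punk::lib implementation.
proc sketch_hex2dec {hexlist} {
    set out {}
    foreach h $hexlist {
        # strip surrounding whitespace, then drop underscores
        set clean [string map {_ ""} [string trim $h]]
        # only hex digits may remain - this rejects internal whitespace such as "F F"
        if {![regexp {^[0-9A-Fa-f]*$} $clean]} {
            error "invalid hex value \"$h\""
        }
        if {$clean eq ""} {
            lappend out 0
        } else {
            # expr handles arbitrarily large integers in Tcl 8.5+
            lappend out [expr {"0x$clean" + 0}]
        }
    }
    return $out
}

puts [sketch_hex2dec [list FF_FF " F" ""]]  ;# 65535 15 0
```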

dec2hex ?option value...? list_decimals

Convert a list of decimal integers to a list of hex values

-width <int> can be used to make each hex value at least int characters wide, with leading zeroes.

-case upper|lower determines the case of the hex letters in the output
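The effect of the width and case settings can be approximated with Tcl's format command. This is a sketch only: sketch_dec2hex is a hypothetical name, and it takes positional width and case arguments with assumed defaults rather than the -width and -case options described above.

```tcl
# Hypothetical sketch - not the punk::lib implementation.
# width: minimum number of characters, zero padded. case: upper or lower.
proc sketch_dec2hex {decimals {width 1} {case upper}} {
    set conv [expr {$case eq "upper" ? "X" : "x"}]
    set out {}
    foreach d $decimals {
        # the ll size modifier avoids truncating large integers
        lappend out [format "%0${width}ll$conv" $d]
    }
    return $out
}

puts [sketch_dec2hex {255 16} 4 upper]  ;# 00FF 0010
puts [sketch_dec2hex {255} 1 lower]     ;# ff
```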

log2 x

log base2 of x

This uses a 'live' proc body - the divisor for the change of base is computed once at definition time

(courtesy of RS https://wiki.tcl-lang.org/page/Additional+math+functions)
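A 'live' proc body is one built with double quotes so that part of it is substituted when the proc is defined. The following sketch shows the technique under the hypothetical name sketch_log2; it is modelled on the description above rather than copied from punk::lib.

```tcl
# Hypothetical sketch of a 'live' proc body.
# [expr {log(2)}] is evaluated once, here, at definition time.
# \$x is escaped so that it is substituted later, on each call.
proc sketch_log2 x "expr {log(\$x)/[expr {log(2)}]}"

# The stored body contains the precomputed divisor rather than a call to log(2).
puts [info body sketch_log2]
puts [sketch_log2 8]
```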

logbase b x

log base b of x

This function uses expr's natural log and the change of base division.

This means for example that we can get results like: logbase 10 1000 = 2.9999999999999996

Use expr's log10() function or tcl::mathfunc::log10 for base 10
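The change of base calculation, and the comparison with log10, can be tried with a small sketch. The name sketch_logbase is hypothetical; the exact digits printed depend on floating point behaviour on the platform.

```tcl
# Hypothetical sketch of the change of base division described above.
proc sketch_logbase {b x} {
    expr {log($x)/log($b)}
}

# Compare the change of base result with the dedicated base 10 function.
puts [sketch_logbase 10 1000]
puts [tcl::mathfunc::log10 1000]
puts [sketch_logbase 3 81]
```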

factors x

Return a sorted list of the positive factors of x where x > 0

For x = 0 we return only 0 and 1 as technically any number divides zero and there are an infinite number of factors. (including zero itself in this context)*

This is a simple brute-force implementation that iterates all numbers below the square root of x to check the factors

Because the implementation is so simple - the performance is very reasonable for numbers below at least a few tens of millions

See tcllib math::numtheory::factors for a more complex implementation - which seems to be slower for 'small' numbers

Comparisons were done with some numbers below 17 digits long

For seriously big numbers - this simple algorithm would no doubt be outperformed by more complex algorithms.

The numtheory library stores some data about primes etc with each call - so may become faster when being used on more numbers but has the disadvantage of being slower for 'small' numbers and using more memory.

If the largest factor below x is needed - the greatestOddFactorBelow and greatestFactorBelow functions are a faster way to get there than computing the whole list, even for small values of x

* Taking x=0; Notion of x being divisible by integer y being: There exists an integer p such that x = py

In other mathematical contexts zero may be considered not to divide anything.
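A brute-force approach of the kind described can be sketched as follows. This is an illustration, not the punk::lib code: sketch_factors is a hypothetical name, and it assumes x is a non-negative integer.

```tcl
# Hypothetical sketch - not the punk::lib implementation.
# Test each candidate up to the square root of x; every divisor i found
# also yields the paired divisor x/i, so no candidates above the root are needed.
proc sketch_factors {x} {
    if {$x == 0} {return [list 0 1]}
    set small {}
    set large {}
    for {set i 1} {$i * $i <= $x} {incr i} {
        if {$x % $i == 0} {
            lappend small $i
            set j [expr {$x / $i}]
            if {$j != $i} {lappend large $j}
        }
    }
    # small is ascending and large is descending - reverse large to get a sorted result
    return [concat $small [lreverse $large]]
}

puts [sketch_factors 12]  ;# 1 2 3 4 6 12
puts [sketch_factors 16]  ;# 1 2 4 8 16
```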

oddFactors x

Return a list of odd integer factors of x, sorted in ascending order

greatestFactorBelow x

Return the largest factor of x excluding itself

factor functions can be useful for console layout calculations

See Tcllib math::numtheory for more extensive implementations

greatestOddFactorBelow x

Return the largest odd integer factor of x excluding x itself

greatestOddFactor x

Return the largest odd integer factor of x

For an odd value of x - this will always return x
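These functions do not need the full factor list. A possible sketch of two of them is shown below; the proc names are hypothetical, x is assumed to be a positive integer (greater than 1 for the first proc), and punk::lib may compute these differently.

```tcl
# Hypothetical sketches - not the punk::lib implementation.

# Largest factor of x excluding x itself: x divided by its smallest divisor above 1.
proc sketch_greatestFactorBelow {x} {
    for {set i 2} {$i * $i <= $x} {incr i} {
        if {$x % $i == 0} {return [expr {$x / $i}]}
    }
    return 1  ;# no divisor found - x is prime
}

# Largest odd factor of x: divide out every factor of 2.
proc sketch_greatestOddFactor {x} {
    if {$x <= 0} {error "x must be a positive integer"}
    while {$x % 2 == 0} {
        set x [expr {$x / 2}]
    }
    return $x
}

puts [sketch_greatestFactorBelow 12]  ;# 6
puts [sketch_greatestOddFactor 48]    ;# 3
puts [sketch_greatestOddFactor 15]    ;# 15
```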

gcd n m

Return the greatest common divisor of m and n

Straight from Lars Hellström's math::numtheory library in Tcllib

Graphical use:

An a by b rectangle can be covered with square tiles of side-length c,

only if c is a common divisor of a and b

lcm n m

Return the lowest common multiple of m and n

Straight from Lars Hellström's math::numtheory library in Tcllib
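For readers without Tcllib to hand, the two calculations can be sketched with the Euclidean algorithm. These are plain illustrative versions under hypothetical names, not the Tcllib or punk::lib code.

```tcl
# Hypothetical sketches using the Euclidean algorithm.
proc sketch_gcd {n m} {
    while {$m != 0} {
        set r [expr {$n % $m}]
        set n $m
        set m $r
    }
    return [expr {abs($n)}]
}

# lowest common multiple via the identity lcm(n,m) * gcd(n,m) = |n*m|
proc sketch_lcm {n m} {
    if {$n == 0 || $m == 0} {return 0}
    return [expr {abs($n * $m) / [sketch_gcd $n $m]}]
}

# A 12 by 18 rectangle can be tiled with squares of side 6 (and of any side dividing 6).
puts [sketch_gcd 12 18]  ;# 6
puts [sketch_lcm 4 6]    ;# 12
```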

commonDivisors x y

Return a list of all the common factors of x and y

(equivalent to factors of their gcd)

hasglobs str

Return a boolean indicating whether str contains any of the glob characters: * ? [ ]

hasglobs uses append to preserve Tcl's internal representation for str - so it should help avoid shimmering in the few cases where this may matter.
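The test itself amounts to looking for any one of four characters. A simple sketch is given below under the hypothetical name sketch_hasglobs; it does not reproduce the append technique mentioned above.

```tcl
# Hypothetical sketch - checks for the glob characters * ? [ ]
proc sketch_hasglobs {str} {
    foreach ch {* ? [ ]} {
        if {[string first $ch $str] >= 0} {return 1}
    }
    return 0
}

puts [sketch_hasglobs "file*.txt"]  ;# 1
puts [sketch_hasglobs "plain.txt"]  ;# 0
```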

trimzero number

Return number with left-hand-side zeros trimmed off - unless all zero

If number is all zero - a single 0 is returned
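The described behaviour can be sketched with string trimleft. The name sketch_trimzero is hypothetical and the sketch assumes number is a string of digits.

```tcl
# Hypothetical sketch - not the punk::lib implementation.
proc sketch_trimzero {number} {
    set trimmed [string trimleft $number 0]
    # everything was zero - keep a single 0
    if {$trimmed eq ""} {return 0}
    return $trimmed
}

puts [sketch_trimzero 00123]  ;# 123
puts [sketch_trimzero 0000]   ;# 0
```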

substring_count str substring

Search str and return number of occurrences of substring
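One way to count occurrences is to step through the string with string first. The sketch below uses the hypothetical name sketch_substring_count and assumes non-overlapping matches are wanted; the description above does not say how punk::lib treats overlaps.

```tcl
# Hypothetical sketch - counts non-overlapping occurrences (an assumption).
proc sketch_substring_count {str substring} {
    set sublen [string length $substring]
    if {$sublen == 0} {return 0}
    set count 0
    set pos 0
    while {[set pos [string first $substring $str $pos]] >= 0} {
        incr count
        # continue searching after the end of this match
        incr pos $sublen
    }
    return $count
}

puts [sketch_substring_count banana an]  ;# 2
```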

dict_merge_ordered defaults main

The standard dict merge accepts multiple dicts with values from dicts to the right (2nd argument) taking precedence.

When merging with a dict of default values - this means that any default key/vals that weren't in the main dict appear in the output before the main data.

This function merges the two dicts whilst maintaining the key order of main followed by defaults.
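The difference in key order can be seen by comparing dict merge with a sketch of the ordered merge. The name sketch_dict_merge_ordered is hypothetical and the body is one possible approach rather than the punk::lib code.

```tcl
# Hypothetical sketch - not the punk::lib implementation.
# Start from main (keeping its key order), then append any defaults that main lacks.
proc sketch_dict_merge_ordered {defaults main} {
    set merged $main
    dict for {k v} $defaults {
        if {![dict exists $merged $k]} {
            dict set merged $k $v
        }
    }
    return $merged
}

set defaults {a 1 b 2}
set main     {b 20 c 30}
puts [dict merge $defaults $main]                 ;# a 1 b 20 c 30
puts [sketch_dict_merge_ordered $defaults $main]  ;# b 20 c 30 a 1
```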

askuser question

A basic utility to read an answer from stdin

The prompt is written to the terminal and then it waits for a user to type something

stdin is temporarily configured to blocking and then put back in its original state in case it wasn't already so.

If the terminal is using punk::console and is in raw mode - the terminal will temporarily be put in line mode.

(Generic terminal raw vs linemode detection not yet present)

The user must hit enter to submit the response

The return value is the string if any that was typed prior to hitting enter.

The question argument can be manually colourised using the various punk::ansi functions

   set answer [punk::lib::askuser "[a+ green bold]Do you want to proceed? (Y|N)[a]"]
   if {[string match y* [string tolower $answer]]} {
       puts "Proceeding"
   } else {
       puts "Cancelled by user"
   }

linesort ?sortoption ?val?...? textblock

Sort lines in textblock

Returns another textblock with lines sorted

options are flags as accepted by lsort ie -ascii -command -decreasing -dictionary -index -indices -integer -nocase -real -stride -unique
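The behaviour can be approximated by splitting the block, passing the flags through to lsort and joining the result. The sketch below is illustrative only: sketch_linesort is a hypothetical name and it assumes the textblock is always the last argument.

```tcl
# Hypothetical sketch - not the punk::lib implementation.
proc sketch_linesort {args} {
    set textblock [lindex $args end]
    set sortopts  [lrange $args 0 end-1]
    # normalise crlf to lf before splitting into lines
    set lines [split [string map [list \r\n \n] $textblock] \n]
    return [join [lsort {*}$sortopts $lines] \n]
}

puts [sketch_linesort -integer "10\n9\n100"]
```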

list_as_lines ?-joinchar char? linelist

This simply joins the elements of the list with -joinchar

It is mainly intended for use in pipelines where the primary argument comes at the end - but it can also be used as a general replacement for join $lines <le>

The sister function lines_as_list takes a block of text and splits it into lines - but with more options related to trimming the block and/or each line.

lines_as_list ?option value ...? text

Returns a list of possibly trimmed lines depending on options

The concept of lines is raw lines from splitting on newline after crlf is mapped to lf

- not console lines which may be entirely different due to control characters such as vertical tabs or ANSI movements
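The core of the two sister functions, without any of the trimming options, can be sketched as below. The proc names are hypothetical, the join character is a positional argument with an assumed default of newline, and the real functions accept options not shown here.

```tcl
# Hypothetical sketches - not the punk::lib implementation.

# Split text into raw lines after mapping crlf to lf.
proc sketch_lines_as_list {text} {
    return [split [string map [list \r\n \n] $text] \n]
}

# Join list elements into a block of text.
proc sketch_list_as_lines {linelist {joinchar \n}} {
    return [join $linelist $joinchar]
}

set lines [sketch_lines_as_list "one\r\ntwo\nthree"]
puts [llength $lines]                  ;# 3
puts [sketch_list_as_lines $lines ","] ;# one,two,three
```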

opts_values ?option value...? optionspecs rawargs

Parse rawargs as a sequence of zero or more option-value pairs followed by zero or more values

Returns a dict of the form: opts <options_dict> values <values_dict>

ARGUMENTS:

multiline-string optionspecs

This is a block of text with records delimited by newlines (lf or crlf) - but with multiline values allowed if properly quoted/braced

'info complete' is used to determine if a record spans multiple lines due to multiline values

Each optionspec line must be of the form:

-optionname -key val -key2 val2...

where the valid keys for each option specification are: -default -type -range -choices -optional

list rawargs

This is a list of the arguments to parse. Usually it will be the $args value from the containing proc
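The use of 'info complete' to find records that span multiple lines in optionspecs can be sketched as follows. This shows the record-splitting step only, under the hypothetical name sketch_split_records; it is not the opts_values implementation and does no option parsing or validation.

```tcl
# Hypothetical sketch of splitting optionspecs into records.
proc sketch_split_records {optionspecs} {
    set records {}
    set pending ""
    foreach line [split [string map [list \r\n \n] $optionspecs] \n] {
        append pending $line
        if {[info complete $pending]} {
            # braces and quotes are balanced - the record is finished
            if {[string trim $pending] ne ""} {
                lappend records $pending
            }
            set pending ""
        } else {
            # an open brace or quote remains - the value continues on the next line
            append pending \n
        }
    }
    return $records
}

set specs "-count -default 1 -type integer\n-text -default {line one\nline two}\n"
set records [sketch_split_records $specs]
puts [llength $records]  ;# 2
```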