[para]Reconstructs and returns the raw line using the payload and per-line stored line-ending metadata
[para]Reconstructs and returns the raw line using the payload and per-line stored line-ending metadata
[para]A 'line' may be returned without a line-ending if the unerlying chunk had trailing data without a line-ending (or the chunk was loaded under a non-standard -policy setting)
[para]A 'line' may be returned without a line-ending if the unerlying chunk had trailing data without a line-ending (or the chunk was loaded under a non-standard -policy setting)
@ -91,6 +102,8 @@ or
[para]Return a dict of the metadata and text for the line indicated by the zero-based lineindex
[para]Return a dict of the metadata and text for the line indicated by the zero-based lineindex
[para]This returns the same info as the [method linemeta] with an added key of 'payload' which is the text of the line without line-ending.
[para]This returns the same info as the [method linemeta] with an added key of 'payload' which is the text of the line without line-ending.
[para]The 'payload' value is the same as is returned from the [method linepayload] method.
[para]The 'payload' value is the same as is returned from the [method linepayload] method.
<dd><p>Reconstructs and returns the raw line using the payload and per-line stored line-ending metadata</p>
<dd><p>Reconstructs and returns the raw line using the payload and per-line stored line-ending metadata</p>
<p>A 'line' may be returned without a line-ending if the unerlying chunk had trailing data without a line-ending (or the chunk was loaded under a non-standard -policy setting)</p>
<p>A 'line' may be returned without a line-ending if the unerlying chunk had trailing data without a line-ending (or the chunk was loaded under a non-standard -policy setting)</p>
@ -252,44 +271,51 @@ or
<dd><p>Return a dict of the metadata and text for the line indicated by the zero-based lineindex</p>
<dd><p>Return a dict of the metadata and text for the line indicated by the zero-based lineindex</p>
<p>This returns the same info as the <bclass="method">linemeta</b> with an added key of 'payload' which is the text of the line without line-ending.</p>
<p>This returns the same info as the <bclass="method">linemeta</b> with an added key of 'payload' which is the text of the line without line-ending.</p>
<p>The 'payload' value is the same as is returned from the <bclass="method">linepayload</b> method.</p></dd>
<p>The 'payload' value is the same as is returned from the <bclass="method">linepayload</b> method.</p></dd>
<dd><p>Return a list of dicts each with structure like the result of the <bclass="method">lineinfo</b> method - but possibly with extra keys for truncation information if -show_truncated 1 is supplied</p>
<dd><p>Return a list of dicts each with structure like the result of the <bclass="method">lineinfo</b> method - but possibly with extra keys for truncation information if -show_truncated 1 is supplied</p>
<p>The truncation key in a lineinfo dict may be returned for first and/or last line in the resulting list.</p>
<p>The truncation key in a lineinfo dict may be returned for first and/or last line in the resulting list.</p>
<p>truncation shows the shortened (missing bytes on left and/or right side) part of the entire line (potentially including line-ending or even partial line-ending)</p>
<p>truncation shows the shortened (missing bytes on left and/or right side) part of the entire line (potentially including line-ending or even partial line-ending)</p>
<p>Note that this truncation info is only in the return value of this method - and will not be reflected in <bclass="method">lineinfo</b> queries to the main chunk.</p></dd>
<p>Note that this truncation info is only in the return value of this method - and will not be reflected in <bclass="method">lineinfo</b> queries to the main chunk.</p></dd>
<dd><p>A helper to return any Tcl-style end end-x values given to startidx or endidx; converted to their specific values based on the current state of the underlying line data</p>
<dd><p>A helper to return any Tcl-style end end-x values given to startidx or endidx; converted to their specific values based on the current state of the underlying line data</p>
<p>This is used internally by API functions such as <bclass="method">line</b> to enable it to accept more expressive indices</p></dd>
<p>This is used internally by API functions such as <bclass="method">line</b> to enable it to accept more expressive indices</p></dd>
<dd><p>A helper to return any Tcl-style end end-x entries supplied to startidx or endidx; converted to their specific values based on the current state of the underlying chunk data</p></dd>
<dd><p>A helper to return any Tcl-style end end-x entries supplied to startidx or endidx; converted to their specific values based on the current state of the underlying chunk data</p></dd>
<dd><p>A utility to convert some of the of Tcl-style list-index expressions such as end, end-1 etc to valid indices in the range 0 to the supplied max</p>
<dd><p>A utility to convert some of the of Tcl-style list-index expressions such as end, end-1 etc to valid indices in the range 0 to the supplied max</p>
<p>Basic addition and subtraction expressions such as 4-1 5+2 are accepted</p>
<p>Basic addition and subtraction expressions such as 4-1 5+2 are accepted</p>
<p>startidx higher than endidx is allowed</p>
<p>startidx higher than endidx is allowed</p>
<p>Unlike Tcl's index expressions - we raise an error if the calculated index is out of bounds 0 to max</p></dd>
<p>Unlike Tcl's index expressions - we raise an error if the calculated index is out of bounds 0 to max</p></dd>
<dd><p>Takes start and end offset, generally representing bytes or character indices, and computes a list of boundaries at multiples of the chunksize that are spanned by the start and end range.</p>
<dd><p>Takes start and end offset, generally representing bytes or character indices, and computes a list of boundaries at multiples of the chunksize that are spanned by the start and end range.</p>
#[para]Returns a string displaying the boundaries at chunksize bytes between chunkstart and chunkend
#[para]Defaults to using ansi colour if punk::ansi module is available. Use -ansi 0 to disable colour
set defaults [dict create\
-ansi $::punk::fileline::ansi::enabled\
-offset 0\
-displaybytes 200\
-truncatedmark "..."\
-completemark "---"\
-moremark " + "\
-continuemark " > "\
-linemaxwidth 100\
-linebase 0\
-limit -1\
-boundaries {}\
-showconfig 0\
-boundaryheader {Boundary %i% at %b%}\
]
set known_opts [dict keys $defaults]
foreach {k v} $args {
if {$k ni $known_opts} {
error "[self]::chunk_boundary error: unknown option '$k'. Known options: $known_opts"
}
}
set opts [dict merge $defaults $args]
# -- --- --- --- --- ---
set opt_ansi [dict get $opts -ansi]
set opt_offset [dict get $opts -offset]
set opt_displaybytes [dict get $opts -displaybytes]
set opt_tmark [dict get $opts -truncatedmark]
set opt_cmark [dict get $opts -completemark]
set opt_linemax [dict get $opts -linemaxwidth]
set opt_linebase [dict get $opts -linebase]
set opt_linebase [string map [list _ ""] $opt_linebase]
set opt_limit [dict get $opts -limit] ;#limit number of boundaries to display
set opt_boundaries [dict get $opts -boundaries] ;#use pre-calculated boundaries if supplied
set opt_showconfig [dict get $opts -showconfig]
set opt_boundaryheader [dict get $opts -boundaryheader]
# -- --- --- --- --- ---
package require overtype
# will require punk::char and punk::ansi
if {"::punk::fileline::ansi::stripansi" ne [info commands ::punk::fileline::ansi::stripansi]} {
namespace eval ::punk::fileline::ansi {
namespace import ::punk::ansi::*
}
}
#This mechanism for enabling/disabling ansi is a bit clumsy - prone to errors with regard to keeping in sync with any api changes in punk ansi
#It's done here to allow this to be used without the full set of punk modules and/or shell - REVIEW
#risk of failing to reset on error
set pre_ansi_enabled $::punk::fileline::ansi::enabled
if {$opt_ansi} {
set ::punk::fileline::ansi::enabled 1
} else {
set ::punk::fileline::ansi::enabled 0
}
if {"::punk::fileline::stripansi" ne [info commands ::punk::fileline::stripansi]} {
proc ::punk::fileline::a {args} {
if {$::punk::fileline::ansi::enabled} {
tailcall ::punk::fileline::ansi::a {*}$args
} else {
return ""
}
}
proc ::punk::fileline::a+ {args} {
if {$::punk::fileline::ansi::enabled} {
tailcall ::punk::fileline::ansi::a+ {*}$args
} else {
return ""
}
}
proc ::punk::fileline::stripansi {str} {
if {$::punk::fileline::ansi::enabled} {
tailcall ::punk::fileline::ansi::stripansi $str
} else {
return $str
}
}
}
set maxline [lindex [my chunkrange_to_linerange $chunkend $chunkend] 0]
set minline [lindex [my chunkrange_to_linerange $chunkstart $chunkstart] 0]
#suport simple end+-int (+-)start(+-)int to set linebase to line corresponding to chunkstart or chunkend
#also simple int+int and int-int - nothing more complicated (similar to Tcl lrange etc in that regard)
#commonly this will be something like -start or -end
if {![string is integer -strict $opt_linebase]} {
set sign ""
set errunrecognised "unrecognised -linebase value '$opt_linebase'. Expected positive or negative integer or -start -start-int -start+int -end -end-int -end+int or -eof (where leading - is optional but probably desirable) "
if {[string index $opt_linebase 0] eq "-"} {
set sign -
set tail [string range $opt_linebase 1 end]
} else {
set tail [string trimleft $opt_linebase +];#ignore +
}
if {[string match eof* $tail]} {
set endmath [string range $tail 3 end]
#todo endmath?
if {$tail eq "eof"} {
set lastline [lindex [my chunkrange_to_linerange end end] 0]
set linebase ${sign}$lastline
} else {
error $errunrecognised
}
} elseif {[string match end* $tail]} {
set endmath [string range $tail 3 end]
if {[string length $endmath]} {
set op [string index $endmath 0]
if {$op in {+ -}} {
set operand [string range $endmath 1 end]
if {[string is integer -strict $operand]} {
if {$op eq "+"} {
set linebase [expr {$maxline + $operand}]
} else {
set linebase [expr {$maxline - $operand}]
}
} else {
error $errunrecognised
}
} else {
error $errunrecognised
}
} else {
set linebase $maxline
}
set linebase ${sign}$linebase
} elseif {[string match start* $tail]} {
set endmath [string range $tail 5 end]
if {[string length $endmath]} {
set op [string index $endmath 0]
if {$op in {+ -}} {
set operand [string range $endmath 1 end]
if {[string is integer -strict $operand]} {
if {$op eq "+"} {
set linebase [expr {$minline + $operand}]
} else {
set linebase [expr {$minline - $operand}]
}
} else {
error $errunrecognised
}
} else {
error $errunrecognised
}
} else {
set linebase $minline
}
set linebase ${sign}$linebase
} elseif {[string match *-* $tail]} {
set extras [lassign [split $tail -] int1 int2]
if {[llength $extras]} {
error $errunrecognised
}
if {![string is integer -strict $int1] || ![string is integer -strict $int2]} {
error $errunrecognised
}
set linebase [expr {$int1 - $int2}]
set linebase ${sign}$linebase
} elseif {[string match *+* $tail]} {
set extras [lassign [split $tail +] int1 int2]
if {[llength $extras]} {
error $errunrecognised
}
if {![string is integer -strict $int1] || ![string is integer -strict $int2]} {
set rhs [string range $payload_and_le $split+1 end]
set rhs [string range $payload_and_le $split+1 end]
dict set last truncated $truncated
dict set last truncated $truncated
dict set last truncatedside [list right]
dict set last truncatedright $rhs
dict set last truncatedright $rhs
#this has the effect that truncating the rhs by 1 can result in truncated being larger than original payload for crlf lines - as payload now sees the cr
#this has the effect that truncating the rhs by 1 can result in truncated being larger than original payload for crlf lines - as payload now sees the cr
#this is a bit unintuitive - but probably best reflects the reality. The truncated value is the truncated 'line' rather than the truncated 'payload'
#this is a bit unintuitive - but probably best reflects the reality. The truncated value is the truncated 'line' rather than the truncated 'payload'
#even without split crlf - this can overcount by counting the lf or crlf in a line which had an ending not in the chunk range specified
#check first and last infoline for truncations
#Also check if the truncation is directly between an crlf
#both an lhs split and an rhs split could land between cr and lf
#to be precise - we should presumably count the part within our chunk as either a none for cr or an lf
#This means a caller counting chunk by chunk using this method will sometimes get the wrong answer depending on where crlfs lie relative to their chosen chunk size
#This is presumably ok - as it should be a well known thing to watch out for.
#If we're only receiving chunk by chunk we can't reliably detect splits vs lone <cr>s in the data
#There are surely more efficient ways for a caller to count line-endings in the way that makes sense for them
#but we should makes things as easy as possible for users of this line/chunk structure anyway.
set first [lindex $infolines 0]
if {[dict get $first is_truncated]} {
#could be the only line - and truncated at one or both ends.
#both a left and a right truncation could split a crlf
}
set last [lindex $infolines end]
if {[dict get $first lineindex] != [dict get $last lineindex]} {
#only need to process last if it is a different line
#[para]Takes start and end offset, generally representing bytes or character indices, and computes a list of boundaries at multiples of the chunksize that are spanned by the start and end range.
#[para]Takes start and end offset, generally representing bytes or character indices, and computes a list of boundaries at multiples of the chunksize that are spanned by the start and end range.
#target labels can't span lines with ^ - cmd doesn't recognise them (They possibly do span in a way - but but the newlines may be included in the label - so they may be hard/impossible to call).
#Note that we can't filter obviously non-batch-script lines before processing - as the way batch label-scanning works it scans in chunks of 512 bytes so all lines are relevant.
#This means label-like things could be incorrectly found in other script data - that's partly the point of this check
#Note that we can't filter obviously non-batch-script lines before processing - as the way batch label-scanning works it scans in chunks of 512 bytes so all lines are relevant.
#This means label-like things could be incorrectly found in other script data - that's partly the point of this check.
if {[regexp $search_regex $callingline_payload _m precall call labelplus]} {
#todo further checks to see if it's actually a batch script line
# - - - - work out what cmd.exe considers start of 512B boundaries when scanning from a callsite
#callposn affected by newlines?
#callposn affected by newlines?
#set callposn [expr {$file_offset + [string length $callingline_payload]}] ;#take callposn as end of line .. review - multiline statements?
set callposn [expr {$file_offset + $callingline_len}]
#Note there are anomalies around target labels in bracketed sections such as IF blocks
#this is bad practice - as it can lead to unbalanced braces - but batch files can still work under cmd.exe with them in some cases
#e.g unbalanced trailing bracket may be ignored.
#A working script with target-labels in braces can fail due to boundary issues we don't detect (callsite for boundary counting may need to be at end of entire multiline if block??)
#For now - just make sure punk templates don't do this - but it would be nice to be able to detect.
#todo - multiple calls on one line. - also - determine what cmd considers the starting point for forward scanning when call is in a structure such as an if statement.
set callsite_labelfound 0 ;#until proven
if {$callposn != -1} {
if {$callposn != -1} {
puts stdout "[a+ bold cyan]CALLSITE on line $linenum ending at byte $callposn[a]"
set callposn_lineindex [lindex [$objFile chunkrange_to_linerange $callposn $callposn] 0]
#the line represented by callposn may actually be beyond the calling_line_index
set labelinfo [batchlib::get_callsite_label $labelplus]
if {[dict get $labelinfo labelfound]} {
set callsite_labelfound 1
set label [dict get $labelinfo label]
set call_label_record [list label $label line $callingline_num]
#todo - multiple calls on one line. - also - determine what cmd considers the starting point for forward scanning when call is in a structure such as an if statement.
if {$callsite_labelfound} {
puts stdout "[a+ bold cyan]CALLSITE on line $callingline_num ending at byte $callposn[a]"
set callsummary [string range "${call}${labelplus}" 0 100]
set callsummary [string range "${call}${labelplus}" 0 100]
if {[string length $callsummary] < [string length ${call}${labelplus}]} {
if {[string length $callsummary] < [string length ${call}${labelplus}]} {
puts stdout " CALLSITE: $callsummary (truncated to 100 bytes)"
puts stdout " CALLSITE: $callsummary (truncated to 100 bytes)"
set labelpluswords [regexp -inline -all {\S+} $labelplus] ;#don't assume labelplus can be treated as Tcl list - use regexp to split
#set labelpluswords [regexp -inline -all {\S+} $labelplus] ;#don't assume labelplus can be treated as Tcl list - use regexp to split
#NOTE it is invalid to assume label always terminated by space - pair of % characters (variable substitution) can contain a space without terminating label
#NOTE it is invalid to assume label always terminated by space - pair of % characters (variable substitution) can contain a space without terminating label
#set word1 [lindex $labelpluswords 0]
set word1 [lindex $labelpluswords 0]
set word1len [string length $word1]
set labeltail [string range $labelplus $word1len end]
if {[string index $word1 end] eq "^"} {
if {![string length $labeltail]} {
#label
}
} else {
}
#todo batchlib::get_callsite_label $labelplus
##################################
##################################
set label $word1
set labelsize [string length $label]
set labelsize [string length $label]
#scan forward for labels at boundaries
#scan forward for labels at boundaries
set forward_chunk [$objFile chunk $callposn end]
set forward_chunk [$objFile chunk $callposn end]
set forward_chunk_base $callposn ;#name for clarity
incr callid
incr callid
set callvar "call-${callid}_fromline-${linenum}"
set callvar "call-${callid}_fromline-${callingline_num}"
upvar 0 $callvar objForwardScan
upvar 0 $callvar objForwardScan
set objForwardScan [fileline::textinfo new $forward_chunk]
set objForwardScan [fileline::textinfo new $forward_chunk]
puts stdout "[a+ bold red]ERROR: file line: $line_num_global target-label $trimscanline at line-beginning and with byte offset from callsite: $labelposn offset in file: $line_start_global[a]"
puts stdout "[a+ bold red] This target-label appears to span the 512byte boundary at byte $ubound[a] [a+ yellow bold]from callsite.[a]"
puts [$objForwardScan chunk_boundary_display [dict get $scanlineinfo start] [dict get $scanlineinfo end] 512 -linebase $callposn_lineindex+1 -limit 1] ;#+1 on callposn_linindex to do editor-style linenums
} else {
dict set scan_target_label_record ok 1
puts stdout "[a+ bold green]OK: file line: $line_num_global target-label $trimscanline at line-beginning and with byte offset from callsite: $labelposn offset in file: $line_start_global[a]"
set forwardbline_boundaries [dict get $forwardbline_spaninfo boundaries]
foreach b $forwardbline_boundaries {
set relb [expr $b + $scanner_offset]
if {$relb <= [dict get $forwardbline_info end]} {
lappend boundary_positions $relb
} else {
#leave it for the next line - as we may need to adjust offset anyway
break
}
}
}
if {[dict get $forwardbline_info le] eq "crlf"} {
set scanner_offset [expr {[dict get $forwardbline_info end] - [lindex $boundary_positions end]}] ;#reset on crlf
#puts "+++++ set scanner_offset $scanner_offset"
}
set scanner_position [dict get $forwardbline_info end]
}
}
set boundary_positions [lsearch -all -not -inline $boundary_positions 0]
if {[llength $boundary_positions]} {
puts stdout "[a+ blue bold]----> [llength $forward_lines] forward lines, boundaries (possibly with offsets) to check $boundary_positions[a]"
} else {
puts stdout "[a+ blue bold]----> [llength $forward_lines] forward lines, No boundaries to check (generally expected for files with crlf line endings and no extremely long lines)[a]"
}
if {[llength $boundary_positions]} {
puts stdout "line $callingline_num scan from call label $label ending at position $callposn. Next Callsite-relative boundary [lindex $boundary_positions 0]"
for {set i 0} {$i < [llength $boundary_positions]} {incr i} {
set b [lindex $boundary_positions $i]
if {$i < [llength $boundary_positions]-1} {
set nextb [lindex $boundary_positions $i+1]
set top $nextb
} else {
set top end
}
set forwardbline_infolist [$objForwardScan chunkrange_to_lineinfolist $b $top -show_truncated 1]
set forwardbline_info [lindex $forwardbline_infolist 0]
if {[dict get $forwardbline_info is_truncated]} {
set payload_from_boundary [dict get $forwardbline_info truncated]
} else {
set payload_from_boundary [dict get $forwardbline_info payload]
}
set forwardbline_len [dict get $forwardbline_info linelen]
set forwardbline_index [dict get $forwardbline_info lineindex]
set forwardbline_start [dict get $forwardbline_info start]
set forwardbline_start_global [expr {$forward_chunk_base + $forwardbline_start}]
set forwardbline_index_global [lindex [$objFile chunkrange_to_linerange $forwardbline_start_global $forwardbline_start_global] 0]
set forwardbline_num_global [expr {$forwardbline_index_global + 1}]
set found_targetlabel_at_boundary 0
if {[string first : $payload_from_boundary] >= 0} {
#puts stdout "Possible label at boundary $b - testing"
set labelinfo [batchlib::get_target_label_from_line $payload_from_boundary]
if {[dict get $labelinfo labelfound] && [dict get $labelinfo label] eq $label} {
incr found_forward_label
set found_targetlabel_at_boundary 1
} elseif {[dict get $labelinfo labelfound]} {
set unsearched_label [dict get $labelinfo label]
puts stderr "[a+ cyan]Line $forwardbline_num_global: Found an item that cmd may interpret as a target label because of its location at a boundary $b - but it doesn't seem to be the one we are looking for. Looking for '$label' Found: '[dict get $labelinfo label]'[a]"
puts stderr "[a+ yellow]Warning - if the label '$unsearched_label' on line $forwardbline_num_global isn't meant to be a target - it may be safest to make sure batch script isn't using CALL or GOTO with target :$unsearched_label"
puts stdout "linedata:\n"
#puts stdout "'$payload_from_boundary'"
puts [$objForwardScan chunk_boundary_display [dict get $forwardbline_info start] [dict get $forwardbline_info end] 0 -boundaries $b -linebase $callposn_lineindex+1 -limit 1]
#dubious value to check call_labels_found - as we didn't run through and find all call labels first!
if {$unsearched_label in [dict keys $call_labels_found]} {
set boundary_target_label_record [list label $unsearched_label line $forwardbline_num_global error found_via_boundary_check_on_a_different_call_label]
set note "possibly unreliable or dangerous target-label at line $forwardbline_num_global may execute line [expr {$forwardbline_num_global +1}].\n"
append note "Target label not at line start but was found by scanning 512byte chunks from callsite with count resets at any crlf encountered\n"
append note "Adjust spacing between line $callingline_num and $forwardbline_num_global to avoid the 512 boundary - and re-test for other boundary problems"
puts stdout "[a+ bold red]ERROR: line $forwardbline_num_global target-label [dict get $labelinfo rawlabel] found at boundary and with byte offset from callsite: $b [a]"
puts stdout "[a+ bold red] This target-label appears to fall at or just after the 512byte boundary at byte $b[a] [a+ yellow bold]from callsite.[a]"
puts stdout "[a+ bold yellow]Code may execute at line [expr {$forwardbline_num_global + 1}] (or at next 512Byte boundary in some circumstances)[a]"
puts stdout "[a+ bold yellow]Recommend adjusting spacing between line $callingline_num and $forwardbline_num_global[a]"
puts stdout [$objForwardScan chunk_boundary_display [dict get $forwardbline_info start] [dict get $forwardbline_info end] 0 -boundaries $b -linebase $callposn_lineindex+1 -limit 1]
}
#if found any label - peek at next boundary
if {[dict get $labelinfo labelfound] && $i+1 < [llength $boundary_positions]} {
set next_lineinfolist [$objForwardScan chunkrange_to_lineinfolist $nextb end -show_truncated 1]
set next_lineinfo [lindex $next_lineinfolist 0]
puts "peek next boundary data - line [expr {$forwardbline_num_global + 1}]:"
#if {[dict get $next_lineinfo is_truncated]} {
# puts [dict get $next_lineinfo truncated]
#} else {
# puts [dict get $next_lineinfo payload]
#}
puts [$objForwardScan chunk_boundary_display [dict get $next_lineinfo start] [dict get $next_lineinfo end] 0 -boundaries $nextb -linebase $callposn_lineindex+1 -limit 1]
}
}
}
}
$objForwardScan destroy
#scan behind for labels at boundaries - using offset from start of file
#scan behind for labels at boundaries - using offset from start of file
#we do a backward scan even if a forward label has been found, so that we can warn of duplicate labels.
#we do a backward scan even if a forward label has been found, so that we can warn of duplicate labels.
set prior_start 0
set prior_start 0
set prior_end $lineindex ;#only scan from file start to call-site
set prior_end $callingline_index ;#only scan from file start to call-site
set prior_total_offset 0
set pline_begin 0
set found_backward_label 0
set found_backward_label 0
set p_linenum 0
set p_linenum 0
for {set pidx $prior_start} {$pidx <= $prior_end} {incr pidx} {
for {set pidx $prior_start} {$pidx <= $prior_end} {incr pidx} {
#the following is a valid target for @GOTO :#something
puts stdout "[a+ bold red]ERROR: label '$trimpline' at line $p_linenum and offset from file start: $prior_label_posn total offset: $prior_total_offset[a]"
#: ;#something
puts stdout "[a+ bold red] This label appears to span the 512byte boundary at byte $p_ubound[a] [a+ yellow bold]from file start[a]"
} else {
#It is possible for closing bracket ) to also be invisible if there is no open ( active
puts stdout "[a+ bold green]OK: prior label '$trimpline' at offset from file start: $prior_label_posn total offset: $prior_total_offset[a]"
#This only seems to work for a single ) at beggining of the line multiple ) even separated by spaces or ; etc seem to stop the target being found.
#The lone unbalanced ) can act like a comment in other contexts - and can appear multiple times, but only if first ) on the line is followed by a delimiter
#Essentially all characters following the first ) are ignored - but if the first is something like )) then cmd tries to interpret that as a command and fails
# e.g
#) ignored
#);)))) ignored
#)) causes error as cmd tries to run "))" as a command.
#This is a reason why *target* labels shouldn't appear in bracketed blocks - as code jumps to a point where ( ) will be unbalanced
#target labels are literal with regards to % ie not subject to % expansion - but ^ must still be processed
if {[string first : $pline] >= 0} {
#space (and some other chars) allowed between colon and label at target - (but not at callsite)
set labelinfo [batchlib::get_target_label_from_line $pline]
if {[dict get $labelinfo labelfound] && [dict get $labelinfo label] eq $label} {
set target_label_record [list label $label line $p_linenum]
puts stdout "$labelinfo"
incr found_backward_label
set prior_label_posn_in_line [string first : $pline]
set prior_label_posn [expr {$pline_begin + $prior_label_posn_in_line}]
if {($prior_label_posn % 512) == 0} {
set p_ubound [expr {($prior_label_posn / 512) * 512}]
} else {
set p_ubound [expr {(($prior_label_posn /512) +1) * 512}]
}
set p_lbound [expr {$p_ubound - $labelsize}]
if {($prior_label_posn >= $p_lbound) && ($prior_label_posn <= $p_ubound)} {
dict set target_label_record error linestart_and_overlap
puts stdout "[a+ bold red]ERROR: target-label '$trimpline' at line $p_linenum and offset from file start: $prior_label_posn line start: $pline_begin[a]"
puts stdout "[a+ bold red] This target-label appears to span the 512byte boundary at byte $p_ubound[a] [a+ yellow bold]from file start[a]"
puts [$objFile chunk_boundary_display [dict get $plineinfo start] [dict get $plineinfo end] 512 -linebase 1 -limit 1]
} else {
dict set target_label_record ok 1
puts stdout "[a+ bold green]OK: file line: $p_linenum target-label '$trimpline' before call from line $callingline_num. Target is at offset from file start: $prior_label_posn line start: $pline_begin[a]"
#overlap test is just a warning - we have a label-like thing overlapping the boundary
#overlap test is just a warning - we have a label-like thing overlapping the boundary
#todo - take account of fact that target label can be ": <whitespace or label-ignorable chars> labelname" - so using just labelsize won't detect all overlaps
#The label could even be at the end of a long line that appears at first to be a comment e.g something like
puts stdout "[a+ bold yellow] WARNING: possible label $label spans boundary $b from start of file"
puts stdout "[a+ bold yellow] WARNING: possible label $label spans boundary $b from start of file"
}
}
set pline_tail [string range $pline $b end]
set pline_tail [string range $pline $b end]
#if {[string match ":$label *" $pline_tail]} {}
set re1 {\s*:%lbl%[\s|^|=].*}
if {[string first : $pline_tail] >= 0} {
set re1 [string map [list %lbl% $label] $re1]
set labelinfo [batchlib::get_target_label_from_line $pline_tail]
set re2 {\s*:%lbl%$}
set labelfound 0
set re2 [string map [list %lbl% $label] $re2]
if {[dict get $labelinfo labelfound] && [dict get $labelinfo label] eq $label} {
if {[regexp $re1 $pline_tail] || [regexp $re2 $pline_tail]} {
set labelfound 1
lappend error_labels [list label $label file_offset_bytes $b note "label at boundary but no preceeding newline - cmd may interpret as label and execute following line or code at next boundary" callsite [list call ${call}${labelplus} call_linenum $linenum]]
} elseif {[dict get $labelinfo labelfound]} {
puts stdout "[a+ bold red]ERROR: *possible* label '$label' at line $p_linenum and offset from file start: $b total offset: $prior_total_offset[a]"
puts stdout "Note: detected target label [dict get $labelinfo label] at file offset $b at boundary with no preceeding newline - but it's not the one we're currently scanning for"
puts stdout "[a+ bold red] This label with no preceeding newline appears to span the 512byte boundary at byte $b[a] [a+ yellow bold]from file start[a]"
set tail_end [expr {$b + [string length $pline_tail]}]
incr found_backward_label
set tail_spaninfo [fileline::range_spans_chunk_boundaries $tail_start $tail_end 512]
if {[dict get $tail_spaninfo is_span]} {
lappend error_labels [list label $label_found_name file_offset_bytes $b note "label at boundary but no preceeding newline - cmd may interpret as label and execute following line or code at next boundary" callsite [list call ${call}${labelplus} call_linenum $callingline_num]]
set tail_boundaries [dict get $tail_spaninfo boundaries]
set extra_tail_boundaries [lsearch -all -inline -not $tail_boundaries $b]
puts stdout "[a+ bold red]ERROR: *possible* label '$label_found_name' at line $p_linenum and offset from file start: $b line start: $pline_begin[a]"
if {[llength $extra_tail_boundaries]} {
puts stdout "[a+ bold red] This label with no preceeding newline appears to span the 512byte boundary at byte $b[a] [a+ yellow bold]from file start[a]"
puts "Line $p_linenum also spans additional boundaries: $extra_tail_boundaries"
puts stdout "[a+ bold red] cmd.exe may find this label - but it probably shouldn't be relied upon[a]"
set next_boundary [lindex $extra_tail_boundaries 0]
puts stdout "[a+ green bold]OK[a] - target-label $targetkey appears on immediately adjacent lines $previous and $lnum - assuming it is a boundary-avoidance tactic rather than an inadvertent duplicate"
set remaining [lrange $remaining 0 end-1];#retain latest - we will allow a run of targets on subsequent lines
puts stdout "[a+ yellow bold]Warning - target for label $pb was found with a record as being possibly bogus. record: [dict get $possible_target_labels_found $pb][a]"
puts stdout "[a+ yellow bold]Consider moving this target-label and re-checking[a]"
puts stdout "[a+ yellow bold]It may be a call label line that was found by boundary scanning - which shouldn't really happen[a]"
puts stdout "Call record [dict get $call_labels_found $pb]"
puts stdout "These are usually nothing to be concerned about. Some will almost always turn up in a polyglot script that contains batch script."
puts stdout "If some of the label names appear to contain newlines, or are prefixes of or exact matches with legitimate labels - you might consider adjusting the boundary spacing with whitespace or comments to get a different result."
}
}
set result ""
if {[llength $warning_labels]} {
if {[llength $warning_labels]} {
append result "WARNING:" \n
append result "WARNING:" \n
append result "The following labels had warnings" \n
append result "The following labels had warnings" \n
#see also: https://www.dostips.com/forum/viewtopic.php?t=3803 'Rules for label names vs GOTO and CALL
# review - we may need different get_callsite_label functions?
proc get_callsite_label {labelstart} {
#labelstart is the character immediately following the colon (which is optional at callsite) - a label such as ::label doesn't seem valid at call or target sites
#e.g for @goto %= possible comment=% :mylabe%%l etc
#we would expect to be passed only "mylabe%%1 etc"
#It is up to the caller to determine where a callsite label begins.
#note that:
#@REM -----
#@goto ^
#:label
#@REM-----
# is a valid callsite - but doesn't appear to be found by the label scanner as it's own target label even though :label is on it's own line from non-batch perspective
# so the caller will have to do some batch-style line processing to find all call sites
#Also, for the following 2 lines
#@REM ^
#:label
# the label will be found - yet if the :label was a command such as @GOTO - it would not be run as a callsite
#a quick'n'dirty fix for some ways various escapes are handled within labels at callsite.
#a quick'n'dirty fix for some ways various escapes are handled within labels at callsite.
#There seem to be very different rules for labels at target site - presumably because they are not part of a command
#There seem to be very different rules for labels at target site - presumably because they are not part of a command
# Mostly it seems target labels are more literal
# Mostly it seems target labels are more literal with regards to % chars - but ^ are processed the same way at target label
#some rules..
#some rules..
#callsite labels can't have space between : and label - but target labels can
#callsite labels can't have space between : and label - but target labels can
#label terminated by =,: even if prefixed by ^ and even if in squotes or dquotes
#label terminated by =,: even if prefixed by ^ and even if in squotes or dquotes
@ -18,26 +18,30 @@ set -- "$@" "a=[list shebangless punk MULTISHELL tclsh sh bash cmd pwsh powershe
@REM On unix-like systems, call with sh, bash or tclsh. (powershell untested on unix - and requires wscript if security elevation is used)
@REM On unix-like systems, call with sh, bash or tclsh. (powershell untested on unix - and requires wscript if security elevation is used)
@REM Due to lack of shebang (#! line) Unix-like systems will probably (hopefully) default to sh if the script is called without an interpreter - but it may depend on the shell in use when called.
@REM Due to lack of shebang (#! line) Unix-like systems will probably (hopefully) default to sh if the script is called without an interpreter - but it may depend on the shell in use when called.
@REM If you find yourself really wanting/needing to add a shebang line - do so on the basis that the script will exist on unix-like systems only.
@REM If you find yourself really wanting/needing to add a shebang line - do so on the basis that the script will exist on unix-like systems only.
@rem asadmin is for automatic elevation to administrator. Separate window will be created (seems unavoidable with current elevation mechanism) and user will still get security prompt (probably reasonable).
@rem asadmin is for automatic elevation to administrator. Separate window will be created (seems unavoidable with current elevation mechanism) and user will still get security prompt (probably reasonable).
:<asadmin>
:<asadmin>
@SET"asadmin=0"
@SET"asadmin=0"
:</asadmin>
:</asadmin>
@REM nextshell set to pwsh,sh,bash or tclsh
@REM nextshell set to index for validshells .eg 10 for pwsh
@REM -- cmd/batch file section (ignored on unix but should be left in place)
@REM -- This section intended mainly to launch the next shell (and to escalate privileges if necessary)
@REM -- This section intended mainly to launch the next shell (and to escalate privileges if necessary)
@REM -- Avoid customising this if you are not familiar with batch scripting. cmd/batch script is useful, but is probably the least expressive language and most error prone.
@REM -- Avoid customising this if you are not familiar with batch scripting. cmd/batch script can be useful, but is probably the least expressive language and most error prone.
@REM -- For example - as this file needs to use unix-style lf line-endings - the label scanner is susceptible to the 512Byte boundary issue: https://www.dostips.com/forum/viewtopic.php?t=8988#p58888
@REM -- For example - as this file needs to use unix-style lf line-endings - the label scanner is susceptible to the 512Byte boundary issue: https://www.dostips.com/forum/viewtopic.php?t=8988#p58888
@REM -- This label issue can be triggered/abused in files with crlf line endings too - but it is less likely to happen accidentaly.
@REM -- This label issue can be triggered/abused in files with crlf line endings too - but it is less likely to happen accidentaly.
@REm -- See also: https://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts/4095133#4095133
@REm -- See also: https://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts/4095133#4095133
@ -59,7 +63,6 @@ set -- "$@" "a=[list shebangless punk MULTISHELL tclsh sh bash cmd pwsh powershe
puts stderr "fcat WARNING: apparent option $opt found after file argument(s) (expected them before filenames). Passing to fileutil::cat anyway - but for at least some versions, these options may be ignored. commandline 'fcat $args'"
}
}
if {$::tcl_platform(platform) ne "windows"} {
return [fileutil::cat {*}$args]
}
set finalpaths [list]
set finalpaths [list]
foreach p $paths {
foreach p $paths {
if {$has_winpath && [punk::winpath::illegalname_test $p]} {
if {$has_winpath && [punk::winpath::illegalname_test $p]} {