;+
; NAME:
; FLORMAT
;
; AUTHOR:
; Craig B. Markwardt, NASA/GSFC Code 662, Greenbelt, MD 20770
; Craig.Markwardt@nasa.gov
;
; PURPOSE:
; Format a string with named format variables
;
; CALLING SEQUENCE:
; RESULT = FLORMAT(FORMAT, [ struct ], [x=x, y=y, ...], [_EXTRA=struct])
;
; DESCRIPTION:
;
; The function FLORMAT is used to easily insert a set of named
; parameters into a string using simple format codes. The key point
; is that format strings use *named* parameters instead of the
; position in the string.
;
; FLORMAT makes it easy to make maintainable and understandable
; format codes. FLORMAT is a convenience routine, which will be most
; suitable for formatting tabular output, but can be used for any
; complicated string formatting job where the positional parameters
; of STRING() become hard to manage. Users of Python will recognize
; FLORMAT as implementing "string interpolation."
;
; The user passes a format string similar to the IDL printf-style
; format string (i.e. using modified "%" notation), and a set of
; named fields either by passing a structure, keywords, or both. The
; output strings are composed by inserting the named fields into the
; format string with any requested formatting.
;
; The function FLORMAT is equivalent to the STRING(...,FORMAT=fmt)
; method of formatting a string, where the format string is allowed
; to have the name of the variable.
;
; Let us consider an example of formatting a time with hours, minutes
; and seconds into a string as HH:MM:SS. One could use FLORMAT()
; like this,
;
; result = flormat('%(hour)02d:%(min)02d:%(sec)02d', $
; hour=hour, min=min, sec=sec)
;
; The variables HOUR, MIN and SEC are allowed to be scalars or
; vectors. The key point here is that the format string contains the
; *named* keyword variables (or structure entries). Unlike STRING(),
; the actual variables can be passed in any order, since the format
; string itself describes in what order the values will be assembled.
; This is similar to string interpolation in Python.
;
; The same variable can appear multiple times in the format string,
; but the user only needs to specify that variable once. For example,
;
;   result = flormat('<A=\"%(href)s\">Download %(href)s</A>', $
;                    href='filename.txt')
;
; Note that HREF appears twice in the format string.
;
; INPUT VARIABLES:
;
; FLORMAT() allows you to pass in the values as named keywords as
; shown above, where the keyword values are arrays, or by passing in
; an array of structures. A similar example to the one above is,
;
; S = replicate({hour: 0, min: 0, sec: 0}, 100)
; ; ... fill the structure S with 100 time values ...
; result = flormat('%(hour)02d:%(min)02d:%(sec)02d', s)
;
; In this case S is an array of structures, and the result will be an
; array of strings with the same number of elements as S.
;
; Compare this with standard IDL where a FOR-loop is required, no
; repetition is permitted, and it is difficult to see which format
; code corresponds to which variable. For example,
;
;   for i = 0, n_elements(hour)-1 do begin
;     result[i] = string(hour[i], min[i], sec[i], $
;                        format='(%"%02d:%02d:%02d")')
;   endfor
;
; The input structure STRUCT may be an array of structures or a
; structure of arrays. It is also possible to pass *both* a structure
; STRUCT and keywords. The important thing is that each keyword
; and each STRUCT.FIELD must evaluate to the same number of
; elements. If they don't, then the smallest number of elements is
; used.
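;
; A minimal sketch of passing both at once, assuming a hypothetical
; structure array OBS and a hypothetical keyword vector NAME (neither
; is part of FLORMAT itself):
;
;   obs = replicate({hour: 0, min: 0}, 3)
;   obs.hour = [1, 2, 3]
;   obs.min = [5, 10, 15]
;   name = ['a', 'b', 'c']
;   result = flormat('%(name)s at %(hour)02d:%(min)02d', obs, name=name)
;
; Every field supplies three elements, so RESULT is expected to be a
; three-element string array.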
;
; PRINTF-STYLE FORMAT CODES
;
; FLORMAT() uses format codes in either C printf-style format codes
; (the default), or a new "$" shell-style syntax if /SHELL_STYLE$ is
; set.
;
; FLORMAT() assumes by default that C printf-style format codes are
; passed. FLORMAT() uses a slight shorthand for printf-style format
; codes, which saves some space and is more flexible.
;
; Standard printf-style format codes are of the form,
; FORMAT='(%"...format here...")' ;; Standard IDL
; The FLORMAT printf-style format codes simply dispense with the
; redundant parentheses and percent symbol,
; FORMAT='...format here...' ;; FLORMAT notation
; This notation improves the readability of the format string, since
; only the actual format string needs to be present, and the format
; no longer has to be wrapped in a second set of quotation marks as
; in the standard IDL notation. Double quotation marks inside the
; format string itself must still be escaped as \".
;
; Standard IDL format codes look like this,
; %s - string
; %d - integer
; %04d - integer zero-padded to a width of 4, etc
;
; The new FLORMAT format strings look like this,
;
; %(name)s - string based on variable named NAME
; %(value)d - integer based on variable named VALUE
; %(index)04d - integer based on variable named INDEX,
; zero-padded to a width of 4
;
; As you can see, the only difference is the addition of the variable
; name in parentheses. These names are looked up in the input
; keywords and/or structure passed to FLORMAT().
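;
; A short sketch with vector keywords (the names and values are made
; up for illustration):
;
;   result = flormat('%(index)04d %(name)s %(value)d', $
;                    name=['alpha','beta'], value=[10,20], index=[1,2])
;
; RESULT should then hold '0001 alpha 10' and '0002 beta 20'. Note
; that the keywords need not be given in the order used by the
; format string.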
;
; SHELL-STYLE FORMAT CODES
;
; Shell style "$" is a convenience notation when strict formatting is
; less important. Shell-style "$" format strings are signaled by
; setting the SHELL_STYLE$ keyword. Note the trailing dollar-sign
; '$'. The format codes will look like this,
;
; $name - variable named NAME will be placed here
; $value - variable named VALUE will be placed here, etc.
;
; This is exactly how Unix shell string interpolation works.
; Variables are substituted into place using their "natural" format
; code, based on the variable type.
;
; result = flormat('<A=\"$href\">Download $href</A>', /shell_style$, $
; href='filename.txt')
;
; Note that quotation marks still need to be escaped as \", just the
; same as calling STRING() or PRINT with a %-style format string.
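;
; As a further sketch of the "natural" format codes (NAME and COUNT
; are hypothetical keywords chosen for illustration), a string is
; inserted with %s and an integer with %d:
;
;   result = flormat('$name has $count entries', /shell_style$, $
;                    name='table', count=3)
;
; Here RESULT should contain 'table has 3 entries'.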
;
; CAVEATS:
;
; FLORMAT() is a convenience routine meant mostly to improve the
; readability and maintainability of format codes. FLORMAT() is not
; meant for high performance applications. It spends time parsing
; the input format string. It also spends memory building up a
; temporary output structure. However, for most applications such as
; constructing tables of up to thousands of entries, FLORMAT() should
; be perfectly adequate.
;
; The name "FLORMAT" is a play on the words "floor-mat" and "format."
; The "L" in FLORMAT can be thought of standing for "long-form" IDL
; format codes.
;
; PARAMETERS:
;
; FORMAT - scalar format string used to compose the output strings,
; in either printf-style or shell-style notation (see above).
;
; STRUCT - input structure containing named entries. This should
; either be an array of structures, with each field
; containing a scalar; or, a structure where each field
; contains an array with the same number of elements.
;
; RETURNS:
;
; The resulting formatted strings. The return value will be an
; array of strings containing the same number of elements as passed
; as input.
;
; KEYWORD PARAMETERS:
;
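; SHELL_STYLE$ - if set, then the format string is a shell-style
; string.
;
; FORMAT_AM_PM, FORMAT_DAYS_OF_WEEK, FORMAT_MONTHS - passed through
; to the AM_PM, DAYS_OF_WEEK and MONTHS keywords of
; STRING(), respectively.
;
; All other named keywords are available to be used as named formats
; in your format code. Values may be either scalar or vector.
; Vector dimensions must match the dimensions of STRUCT (if
; STRUCT is passed).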
;
; EXAMPLE:
;
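; A small end-to-end sketch, using made-up values for the
; hypothetical variables HOUR, MIN and SEC:
;
;   hour = [1, 12]
;   min  = [5, 30]
;   sec  = [9, 0]
;   print, flormat('%(hour)02d:%(min)02d:%(sec)02d', $
;                  hour=hour, min=min, sec=sec)
;
; This should produce one HH:MM:SS string per element, here 01:05:09
; and 12:30:00.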
;
; ; Additional examples appear above.
;
; SEE ALSO:
;
; STRING, Format codes, C printf-style format codes
;
; MODIFICATION HISTORY:
; Written, CM, 14 Sep 2009
; Finalized and documented, CM, 08 Dec 2011
;
; $Id: flormat.pro,v 1.9 2013/03/16 23:29:40 cmarkwar Exp $
;
;-
; Copyright (C) 2011, Craig Markwardt
; This software is provided as is without any warranty whatsoever.
; Permission to use, copy, modify, and distribute modified or
; unmodified copies is granted, provided this copyright and disclaimer
; are included unchanged.
;-
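pro flormat_structcheck, s, n, tn
  COMPILE_OPT strictarr
  tn = ''
  if n_elements(s) EQ 0 then return

  tp = size(s,/type)
  if tp NE 8 then message, 'ERROR: input variable must be a structure'

  ;; N is the smallest element count seen so far over all fields
  if n_elements(n) EQ 0 then n = n_elements(s.(0))
  tn = tag_names(s)
  nt = n_elements(tn)
  ;; Start at field 0 so that a count carried over from an earlier
  ;; call is also compared against the first field of this structure
  for i = 0, nt-1 do begin
     n = n < n_elements(s.(i))
  endfor

  return
end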
function flormat, format0, s0, _EXTRA=extra, shell_style$=shell, $
                  format_am_pm=am_pm, format_days_of_week=days_of_week, $
                  format_months=months
  COMPILE_OPT strictarr

  if n_params() LT 1 AND n_elements(extra) EQ 0 then begin
     USAGE:
     message, 'USAGE: string = FLORMAT(FORMAT, struct) or', /info
     message, ' string = FLORMAT(FORMAT, x=x, y=y, ...) or', /info
     message, ' string = FLORMAT(FORMAT, _EXTRA=struct)', /info
     return, ''
  endif
  ;; FORMAT must be a scalar
  tp = size(format0,/type)
  if tp NE 7 OR n_elements(format0) GT 1 then begin
     message, 'ERROR: FORMAT must be a scalar format string'
  endif
  fmt = format0[0]

  if n_elements(s0) EQ 0 AND n_elements(extra) EQ 0 then begin
     message, 'ERROR: you must either specify a structure or keywords'
  endif

  ;; Do data-checking and also compute the total number of elements
  flormat_structcheck, s0, n, tn0
  flormat_structcheck, extra, n, tn1

  ;; Decide on whether it is a (%"") C-style or ($"") shell type format string
  fmt_type = '%'
  if keyword_set(shell) then fmt_type = '$'
  ;; Regular expression for %(varname) or $varname splitting
  ;;   Example: "blah blah %(varname) blah blah"
  ;;     splits to "blah blah " and " blah blah"
  ;;   Example: "blah blah $varname blah blah"
  ;;     splits to "blah blah " and " blah blah"
  ;;
  regex = '%\([^)]*\)' ;; %(varname)
  if fmt_type EQ '$' then begin
     regex = '('+regex+'|\$[a-zA-Z_][a-zA-Z0-9_]*)' ;; or $varname
  endif
  spos = strsplit(fmt, regex, /regex, /preserve_null, length=slen)
  ninterp = n_elements(spos)-1

  ;; No special format codes requested, so return immediately
  if ninterp EQ 0 then return, fmt
  ;; Separate the format string into the "surrounding" string data (FMTS)
  ;; and the interpolation data (FMTI).
  fmts = strmid(fmt, spos, slen)
  ipos = spos+slen
  ilen = spos[1:*] - ipos
  ipos = ipos[0:ninterp-1]
  case fmt_type of
     '%': begin ;; %(NAME) -> NAME
        ipos = ipos + 2
        ilen = ilen - 3
     end
     '$': begin ;; $NAME -> NAME
        ipos = ipos + 1
        ilen = ilen - 1
     end
  endcase
  fmti = strmid(fmt, ipos, ilen)
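  ;; Worked trace (illustrative only), assuming fmt = 'a%(x)db':
  ;;   the regex matches '%(x)', so spos = [0,5] and slen = [1,2];
  ;;   fmts = ['a','db'], i.e. the type code 'd' stays in fmts[1];
  ;;   ipos = [1] and ilen = [4] before trimming, [3] and [1] after;
  ;;   fmti = ['x'], and the loop below appends '%' to fmts[0],
  ;;   so the joined format becomes 'a%db'.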
  for i = 0, ninterp-1 do begin
     varname = fmti[i]

     ;; Check structure
     wh = where(strupcase(varname) EQ tn0, ct)
     if ct GT 0 then begin
        wh = wh[0]
        ;; Example kind of this field
        exemplar = s0[0].(wh)
        srci = 0L
     endif else begin
        ;; Check EXTRA
        wh = where(strupcase(varname) EQ tn1, ct)
        if ct GT 0 then begin
           wh = wh[0]
           ;; Example kind of this field
           exemplar = (extra.(wh))[0]
           srci = 1L
        endif else begin
           message, 'ERROR: tag name "'+varname+'" does not exist in input structure'
        endelse
     endelse
     tp = size(exemplar, /type)
     dims = size(exemplar, /dimensions)
     code = '' ;; Default: code is already in format str
     ;; If the user put $varname then we must decide on an output format
     if fmt_type EQ '$' then begin
        case tp of
           1: code = 'd' ;; BYTE
           2: code = 'd' ;; INT
           3: code = 'd' ;; LONG
           4: code = 'g' ;; FLOAT
           5: code = 'g' ;; DOUBLE
           7: code = 's' ;; STRING
           12: code = 'd' ;; UINT
           13: code = 'd' ;; ULONG
           14: code = 'd' ;; LONG64
           15: code = 'd' ;; ULONG64
           else: message, string(varname, $
              format='("ERROR: $",A0," must be of real, integer or string type")')
        endcase
     endif
     ;; New tag name
     tni = string(i, format='("N",I0)')
     if n_elements(news) EQ 0 then begin
        news = create_struct(tni, exemplar)
        imap = [wh]
        isrc = [srci]
     endif else begin
        news = create_struct(news, tni, exemplar)
        imap = [imap, wh]
        isrc = [isrc, srci]
     endelse

     ;; Add %-style format code to appropriate string
     fmts[i] = fmts[i] + '%'+code
  endfor
  ;; Transfer the data to the new output structure
  outs = replicate(news, n)
  for i = 0, n_elements(imap)-1 do begin
     if isrc[i] EQ 0 then begin
        outs.(i) = s0.(imap[i])
     endif else begin
        outs.(i) = extra.(imap[i])
     endelse
  endfor

  ofmt = strjoin(fmts)
  ;; ;; Replace '"' by '\"' (poor man's REPSTR)
  ;; ofmts = strsplit(ofmt, '"', /preserve_null, /extract)
  ;; nquote = n_elements(ofmts)-1
  ;; if nquote GT 0 then begin
  ;;    ofmts[0:nquote-1] = ofmts[0:nquote-1] + '\"'
  ;;    ofmt = strjoin(ofmts)
  ;; endif
  ofmt1 = '(%"'+ofmt+'")'

  return, string(outs, format=ofmt1, $
                 am_pm=am_pm, days_of_week=days_of_week, months=months)
end