Max Nikulin writes: > On 10/08/2022 18:43, Ihor Radchenko wrote: >> Ihor Radchenko writes: >> >> I have updated the patch to use "__/id" when id is too short. >> Any objections? >> >> +When ID is too short (less than 3 chars), use its md5 hash to create > > Misleading docs, you do not use md5. Oops. Will fix. >> +the path." >> + (if (< (length id) 3) >> + (format "--/%s" id) > > Please, do not use path components starting with dash, it is terrible > for CLI tools. By the way, you promised underscores, not dashes. Why? I have no opinion about the possible dummy folder name, except that it should fit the general pattern we already have "xx/..." or "YYYYMM/...". >> + (format "%s/%s" >> + (substring id 0 2) >> + (substring id 2)))) > > Ihor, I have not look into the code around, so my suggestions may have > no sense. > > Is it possible to pass empty string as ID to these functions? Should it > be explicitly checked? It should not be possible under normal operation. > What if ID contains "/" that can not be used in file name? Windows has > more forbidden characters. Then, creating the attachment dir will fail in front of the user. I am not sure if we really need to do much about this. At least, not until we get a bug report. If we really need to solve the edge cases with attach dir generation, we may need to change the overall design in org-attach/org-id to something more restrictive. I am not sure if it is worth the effort compared to other directions where to improve Org. > Do you expect any problem if here (and for timestamp-based ids) > directory component is just padded with some character (unsure what can > better than underscore) to required length? > > "x" -> "_x/x" or "______x/x" > "xy" -> "xy/xy" or "_____xy/xy" > etc. > > From my point of view it might be a bit better than "__" and "unknown". Does it make sense? Timestamp-based IDs are like 20220810T210121.478800 by default. Then, the top-level folders will be like 202208/... If the actual format is different, we cannot possibly expect the timestamp to follow the same pattern, which is why I chose "unknown" referring to unknown date. On the other hand, if an ID has more than 6 characters, it will generate nonsense top-level folder with or without the patch. We could go further and match the ID against [1-9][0-9]\{5\} and put the folder into "unknown/ID" instead, but that would be a breaking change. In summary, I am more of less neutral towards this fallback format, except that it would be nice to be able to recover the ID from the directory path. Padding should be recoverable. I slightly dislike the "___xx" compared to "______" because it will create a proliferation of top-level folders as opposed to cramping the non-standard IDs into a single "______" folder. Attaching the updated patch.