
Async programming is not easy. For example, the following line

    for await ...
will process the whole file, then give you all lines at once.

So the problem is probably that all lines in all files are parsed at the same time, creating a bottleneck somewhere and ballooning memory usage. The solution is to use streams: pause the stream in order to parse the line, then continue the stream, and pipe the parsed result to stdout or a TCP stream. That will use minimal memory, and if the GC doesn't do its job, Node.js now supports manual GC. So you could probably parse the whole thing using only a few kilobytes of memory.



N.B. Assuming `readLineStream` in that for await loop is node's readline object, the code is iterating over an asyncIterator[1], which works exactly as you describe in your solution. (Async iterators have a next method that returns a Promise of the next item.)

[1] https://nodejs.org/api/readline.html#rlsymbolasynciterator


I don't believe that's true. Consider this infinite generator:

    async function* infinite() {
      while (1) {
        yield 'example'
      }
    }

    (async function () {
      for await (const value of infinite()) {
        console.log(value)
      }
    })()
If the generator were consumed all at once, this would never print (because the generator is infinite). `for await... of` consumes only a single step of the stream / generator at a time. It's just syntactic sugar for the usual calls to .next() and so on. See the docs here[0]

[0] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
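To make the "syntactic sugar" point concrete, here's a rough desugaring, assuming a finite generator so the sketch terminates (the names are illustrative):

```javascript
// A finite stand-in for the infinite generator above.
async function* counter(limit) {
  for (let i = 1; i <= limit; i++) yield i;
}

// Roughly what `for await (const value of counter(3))` expands to:
// grab the async iterator, then await one .next() call per item.
async function main() {
  const it = counter(3)[Symbol.asyncIterator]();
  const seen = [];
  let result = await it.next();   // each call produces exactly one value
  while (!result.done) {
    seen.push(result.value);
    result = await it.next();
  }
  return seen;
}
```

Nothing runs ahead of the consumer: the generator body only advances when .next() is called.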


"for await" uses async iterators, which is pretty much the same thing as a stream - just way simpler to use.



