How do others handle publisher data in XML when it starts to get large?

It generally isn't simply a matter of bumping memory and carrying on; the time
taken to process these large files really starts to become an issue as well.
(Aaron, I'm working on the inferior Windows here, but I have seen similar
problems on other platforms too.)

When all the data is one big XML document, it isn't always easy to split it up
into chunks (how do you handle the relationships between elements?).

With a text shim, one can use alternative import mechanisms (the open source
shim, or a pre-processor that breaks the data down into the discrete records
that IDM handles so much better). We have often used Saxon in a batch script to
pre-process the data into individual text documents, one per object/identity.
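For anyone who doesn't have Saxon handy, a streaming parse can do the same per-record split without ever loading the whole document. Here is a minimal Python sketch; the `<export>`/`<identity>` tag names are hypothetical stand-ins for whatever the actual publisher document uses.

```python
# Sketch: stream-split a large XML export into one standalone document
# per record, so memory stays flat regardless of file size.
# The element names below are assumptions; adjust to the real schema.
import io
import xml.etree.ElementTree as ET

def split_records(source, record_tag="identity"):
    """Yield each record element as its own XML string."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            yield ET.tostring(elem, encoding="unicode")
            elem.clear()  # drop the processed subtree from memory

# In-memory sample standing in for a multi-gigabyte file on disk:
sample = io.BytesIO(
    b"<export>"
    b"<identity><name>alice</name></identity>"
    b"<identity><name>bob</name></identity>"
    b"</export>"
)
chunks = list(split_records(sample))
```

Each chunk could then be written out as its own file for the text shim to pick up, much like the per-object documents Saxon produces.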

However, what about a REST or SOAP shim that consumes a huge XML document?

How have others solved this?