My current jq project: create a Diaspora post-abstracter
Given the lack of a search utility on Diaspora*, my evolved strategy has been to create an index or curation of posts, generally with a short entry for each consisting of the title, a brief summary (usually the first paragraph), the date, and the URL.
I'd like to group these by time segment, say, by month, quarter, or year (probably quarter/year).
And as I'm writing this, I'm thinking that it might be handy to indicate some measure of interactions --- comments, reshares, likes, etc.
My tools for developing this would be my Diaspora* profile data extract, and `jq`, the JSON query tool.
It's possible to do some basic extraction and conversion pretty easily. Going from there to a more polished output is ... more complicated.
A typical original post might look like this (excluding the `subscribed_pods_uris` array):
{
"entity_type": "status_message",
"entity_data": {
"author": "dredmorbius@joindiaspora.com",
"guid": "cc046b1e71fb043d",
"created_at": "2012-05-17T19:33:50Z",
"public": true,
"text": "Hey everyone, I'm #NewHere. I'm interested in #debian and #linux, among other things. Thanks for the invite, Atanas Entchev!\r\n\r\nYet another G+ refuge.",
"photos": []
}
}
Key points here are:
- `entity_type`: Values "status_message" or "reshare".
- `author`: This is the user_id of the author, yours truly (in this case in my DiasporaCom incarnation).
- `guid`: Can be used to construct a URL in the form of `https://<hostname>/posts/<guid>`
- `created_at`: The original posting date, in UTC ("Zulu" time).
- `public`: Status, values `true`, `false`. Also apparently missing in a significant number of posts.
- `text`: The post text itself.
{
"entity_type": "reshare",
"entity_data": {
"author": "dredmorbius@joindiaspora.com",
"guid": "5bfac2041ff20567",
"created_at": "2013-12-15T12:45:08Z",
"root_author": "willhill@joindiaspora.com",
"root_guid": "53e457fd80e73bca"
}
}
Again, excluding the `.subscribed_pods_uris`. In most cases, reshares are of less interest than direct posts.
Interestingly, I've a pretty even split between posts and reshares (52% `status_message`, that is, post).
My theory in creating an abstract is:
- Automation is good.
- It's easier to peel stuff off an automatically-created abstract than to add bits back in manually.
- The compilation should contain only public posts and exclude reshares.
- It's relatively easy to create a basic extract:
jq '.user.posts[].entity_data | .author, .guid, .created_at, .text'
Adding in selection and formatting logic gets ... more complicated.
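As a starting point for that selection logic, here is a minimal sketch that keeps only public `status_message` entries. The sample file is a trimmed stand-in built from the two entries shown above, plus one hypothetical post (guid `aaaa000000000001`) that has no `public` key. The `.user.posts[]` path comes from the one-liner above. Treating a missing `public` key as non-public is my assumption, chosen as the cautious default.

```shell
# Trimmed, partly hypothetical stand-in for the real profile extract.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"user": {"posts": [
  {"entity_type": "status_message",
   "entity_data": {"guid": "cc046b1e71fb043d", "created_at": "2012-05-17T19:33:50Z",
                   "public": true, "text": "Hey everyone"}},
  {"entity_type": "status_message",
   "entity_data": {"guid": "aaaa000000000001", "created_at": "2012-06-01T08:00:00Z",
                   "text": "Hypothetical post with no public key"}},
  {"entity_type": "reshare",
   "entity_data": {"guid": "5bfac2041ff20567", "created_at": "2013-12-15T12:45:08Z",
                   "root_author": "willhill@joindiaspora.com",
                   "root_guid": "53e457fd80e73bca"}}
]}}
EOF

# entity_type sits beside entity_data, so filter on it before descending.
# A missing "public" key evaluates to null, and null == true is false,
# so such posts are dropped along with explicitly private ones.
public_guids=$(jq -r '
  .user.posts[]
  | select(.entity_type == "status_message")
  | .entity_data
  | select(.public == true)
  | .guid
' "$sample")

echo "$public_guids"
rm -f "$sample"
```

If posts without a `public` key should be kept instead, `select(.public != false)` would be the looser variant.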
Among other factors, `jq` is a very quirky language.

Desired Output Format
I would like to produce output which renders something like this for any given post:
Diaspora Tips: Pods, Hashtags & Following
For the many Google Plus refugees showing up on Diaspora and Pluspora, some pointers: ...
https://diaspora.glasswings.com/posts/a53ac360ae53013611b60218b786018b (2018-10-10 00:45)
What if any options are there for running Federated social networking tools on or through #OpenWRT or related router systems on a single-user or household basis?
I'm trying to coordinate and gather information for #googleplus (and other) users looking to migrate to Fediverse platforms, and I'm aware that OpenWRT, #Turris (I have a #TurrisOmnia), and several other router platforms can run services, mostly #NextCloud that I'm aware. ...
https://diaspora.glasswings.com/posts/91f54380af58013612800218b786018b (2018-10-11 07:52)
The original posts can of course be viewed at the URLs shown.
What this is doing is:
- Extracting the first line of the post text itself.
- Stripping all formatting from it.
- Bolding the result by surrounding it in `**` Markdown.
- Including the second paragraph, terminating it in an ellipsis `...`.
- Including a generated URL, based on the GUID, and here parked on Glasswings. (I might also create links to Archive.Today and Archive.Org of the original content.)
- Including the post date, with time in YYYY-MM-DD hh:mm resolution.
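The steps above can be sketched in `jq` roughly as follows. This is a sketch under assumptions, not a finished formatter: the sample post is entirely hypothetical, the helper name `abstract` and the `$host` argument are my own, and "stripping all formatting" is reduced to removing leading heading marks and asterisks.

```shell
# Entirely hypothetical sample post, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"user": {"posts": [
  {"entity_type": "status_message",
   "entity_data": {"guid": "0123456789abcdef", "created_at": "2018-10-10T00:45:00Z",
                   "public": true,
                   "text": "## Sample *title*\r\n\r\nSecond paragraph here.\r\nThird line."}}
]}}
EOF

abstract=$(jq -r --arg host "diaspora.glasswings.com" '
  # Build a three-line abstract from one entity_data object.
  def abstract($host):
    # Split on runs of CRLF or LF and drop empty pieces.
    (.text // "" | [splits("(\r?\n)+") | select(length > 0)]) as $p
    # First line, minus leading heading marks and any asterisks.
    | (($p[0] // "") | sub("^#+ +"; "") | gsub("\\*"; "")) as $title
    # Keep YYYY-MM-DDThh:mm and swap the T for a space.
    | (.created_at[0:16] | sub("T"; " ")) as $when
    | [ "**\($title)**",
        (if ($p | length) > 1 then $p[1] + " ..." else empty end),
        "https://\($host)/posts/\(.guid) (\($when))"
      ]
    | join("\n");

  .user.posts[].entity_data | abstract($host)
' "$sample")

echo "$abstract"
rm -f "$sample"
```

A post with only one line of text simply skips the summary line, since the `if` emits nothing when there is no second piece.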
Specific questions / challenges:
- How to conditionally export only public posts.
- How to conditionally export only `status_message` (that is, original) posts, rather than reshares.
- How to create lagged "oldYear" and "oldMonth" variables.
- How to conditionally output content when computed Month and Year values > oldMonth and oldYear respectively. Goal is to create `## .year` and `### .month` segments in output.
- How to output up to two paragraphs, where posts may consist of fewer than two separate text lines, and lines may be separated by multiple or only single linefeeds `\r\n`.
- Collect and output hashtags used in the post.
- Include counts of comments, reshares, likes, etc. I'm not even sure this is included in the JSON output.
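For the lagged "oldYear" / "oldMonth" items in the list above, one possible approach is `foreach`, which carries a state object from one post to the next. The sketch below is an assumption about how this could be done, not a tested solution against the full extract: the sample guids and the last two timestamps are hypothetical, selection of public posts is left out for brevity, and each post is reduced to a `- guid` line.

```shell
# Hypothetical sample, deliberately out of date order.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"user": {"posts": [
  {"entity_type": "status_message",
   "entity_data": {"guid": "post-c", "created_at": "2013-01-02T10:00:00Z"}},
  {"entity_type": "status_message",
   "entity_data": {"guid": "post-a", "created_at": "2012-05-17T19:33:50Z"}},
  {"entity_type": "status_message",
   "entity_data": {"guid": "post-b", "created_at": "2012-05-20T08:15:00Z"}}
]}}
EOF

outline=$(jq -r '
  [.user.posts[].entity_data]
  # ISO 8601 UTC timestamps sort chronologically as plain strings.
  | sort_by(.created_at)
  | foreach .[] as $post (
      # State: the previous year and month, plus the lines for this step.
      {year: null, month: null, out: []};
      ($post.created_at[0:4]) as $y
      | ($post.created_at[5:7]) as $m
      # On the right-hand side, .year and .month are still the old values.
      | .out = (
          (if $y != .year then ["## \($y)"] else [] end)
          + (if $y != .year or $m != .month then ["### \($m)"] else [] end)
          + ["- \($post.guid)"]
        )
      | .year = $y
      | .month = $m;
      # Emit only the lines produced for this post.
      .out[]
    )
' "$sample")

echo "$outline"
rm -f "$sample"
```

Comparing with `!=` rather than `>` is a deliberate choice here: once the posts are sorted, any change in year or month marks a new segment.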
And of course, if I have to invoke other tools for part of the formatting, that's an option, though an all-in-jq solution would be handy.
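Staying within `jq`, the hashtag item looks tractable with `scan`. A minimal sketch, assuming a hashtag is simply `#` followed by word characters (so a URL fragment such as `page#section` would be picked up as a false positive). The sample is the `#NewHere` post shown earlier.

```shell
# The status_message example from above, trimmed to the fields used here.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"user": {"posts": [
  {"entity_type": "status_message",
   "entity_data": {"guid": "cc046b1e71fb043d", "created_at": "2012-05-17T19:33:50Z",
                   "public": true,
                   "text": "Hey everyone, I'm #NewHere. I'm interested in #debian and #linux, among other things. Thanks for the invite, Atanas Entchev!\r\n\r\nYet another G+ refuge."}}
]}}
EOF

tags=$(jq -r '
  .user.posts[].entity_data
  # Reshares carry no text, so skip anything without it.
  | select(.text != null)
  # scan emits each regex match; unique de-duplicates and sorts them.
  | ([.text | scan("#\\w+")] | unique | join(" ")) as $tags
  | "\(.guid): \($tags)"
' "$sample")

echo "$tags"
rm -f "$sample"
```

Note that `unique` sorts by codepoint, so capitalised tags come before lowercase ones. Dropping `unique` would preserve the order of appearance instead.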
#jq #json #diaspora #scripting #linux
DEFUNCT Carsten Raddatz (劉愷恩) -> now at nerdica
in reply to Doc Edward Morbius

Doc Edward Morbius
in reply to Doc Edward Morbius

I'd have said "joindiaspora.com", as I'd been on that and followed you from there, but ... well, it's also dead.
@Isaac Kuo also turned up another instance, I believe Socialhome (I need to go looking for that) which is both well-federated and searchable from other instances.
For shins'n'grits:
- https://diasp.org/people/da010950d6e70136b502005056264835
- https://diasp.eu/people/da010950d6e70136b502005056264835
- https://diasp.de/people/da010950d6e70136b502005056264835
- https://diaspora.glasswings.com/people/da010950d6e70136b502005056264835
- https://nerdica.net/people/da010950d6e70136b502005056264835
Sadly, attempting to view a user's profile from a third-party pod does not seem to work. That would be one way to grab a fully-federated instance.
Now to track down Isaac's find again...

Doc Edward Morbius
in reply to Doc Edward Morbius

https://socialhome.network
The user reference hash is different, but you can find it by searching for a specific handle.
Looking for `carstenraddatz@pluspora.com` gives:
https://socialhome.network/p/c05bd2c3-426f-4b94-bb0e-28456b36760e/
I'm not aware of a way of automatically scraping / requesting that content, but you should be able to manually page through posts.
The posts are not referenced by the Diaspora* GUID. E.g.:
https://socialhome.network/content/10897028/weil-escher-escher-visualization-art-infi/
Let me take a deeper look at that...
.... I'm not seeing an obvious way to grab a JSON abstract of either profiles or posts.
There's source at GitLab: https://git.feneas.org/socialhome/socialhome
Or @Jason Robinson 🐍🍻 might be able to offer some pointers.