统计性别的分布情况

现在有一个 users collection,我们需要再 BI 中展示出 users 的性别分布情况,数据如下:

[
    {"name": "USER-01","gender": "male"},
    {"name": "USER-02","gender": "female"},
    {"name": "USER-03","gender": "male"},
    {"name": "USER-04","gender": "female"},
    {"name": "USER-05","gender": "male"},
    {"name": "USER-06","gender": "female"},
    {"name": "USER-07","gender": "female"},
    {"name": "USER-08","gender": "female"},
]

Laravel 中实现如下:

use Jenssegers\Mongodb\Collection;

$data = User::query()->raw(function(Collection $collection) {
    return $collection->aggregate([
        [
            '$group' => [
                '_id' => '$gender',
                'count' => [
                    '$sum' => 1
                ]
            ]
        ],
        [
            '$project' => [
                '_id' => 0,
                'gender' => '$_id',
                'count' => 1
            ]
        ]
    ])->toArray();
});

return response()->json($data);

得到的结果如下:

[
    {
        "gender": "male",
        "count": 3
    },
    {
        "gender": "female",
        "count": 5
    }
]

统计出生年份分布情况

现在有一个 users collection,我们需要再 BI 中展示出 users 的出生年分布情况,数据如下:

[
    {"name": "USER-01","birthday": ISODate("1998-04-24T00:00:00.000Z")},
    {"name": "USER-02","birthday": ISODate("1998-04-24T00:00:00.000Z")},
    {"name": "USER-03","birthday": ISODate("1997-04-24T00:00:00.000Z")},
    {"name": "USER-04","birthday": ISODate("2000-04-24T00:00:00.000Z")},
    {"name": "USER-05","birthday": ISODate("1991-04-24T00:00:00.000Z")},
    {"name": "USER-06","birthday": ISODate("1992-04-24T00:00:00.000Z")},
    {"name": "USER-07","birthday": ISODate("1992-04-24T00:00:00.000Z")},
    {"name": "USER-08","birthday": ISODate("1994-04-24T00:00:00.000Z")},
]

Laravel 中实现如下:

use Jenssegers\Mongodb\Collection;

$data = User::query()->raw(function(Collection $collection) {
    return $collection->aggregate([
        [
            '$group' => [
                '_id' => [
                    '$dateToString' => [
                        'format' => '%Y',
                        'date' => '$birthday'
                    ]
                ],
                'count' => [
                    '$sum' => 1
                ]
            ]
        ],
        [
            '$project' => [
                '_id' => 0,
                'year' => '$_id',
                'count' => 1
            ]
        ]
    ])->toArray();
});

return response()->json($data);

得到的结果如下:

[
    {
        "year": "1998",
        "count": 2
    },
    {
        "year": "1997",
        "count": 1
    },
    {
        "year": "2000",
        "count": 1
    },
    {
        "year": "1991",
        "count": 1
    },
    {
        "year": "1992",
        "count": 2
    },
    {
        "year": "1994",
        "count": 1
    }
]

其他查询条件

如果需要增加其他查询条件可以在 Pipeline 中使用 $match,如下面代码所示:

$result = User::query()->raw(function (Collection $collection) {
    return $collection->aggregate([
        [
            '$match' => [
                'status' => 'active',
                'created_at' => [
                    '$gte' => new UTCDateTime('2010-01-01'),
                    '$lte' => new UTCDateTime('2022-08-10')
                ]
            ]
        ],
        [
            '$group' => [
                '_id' => [
                    '$dateToString' => [
                        'format' => '%Y',
                        'date' => '$birthday'
                    ]
                ],
                'count' => [
                    '$sum' => 1
                ]
            ]
        ],
        [
            '$project' => [
                '_id' => 0,
                'year' => '$_id',
                'value' => '$count'
            ]
        ]
    ])->toArray();
});

上面的 $match 中表示只查询用户状态为 active 并且注册于 2010-01-01 ~ 2022-08-10 之间的用户。

总结

ORM 虽然好用省事,但是有些复杂的查询依然无法满足。不知道是不是年纪大了,越来越喜欢手写 SQL 或者 BSON 查询了。

I hope this is helpful, Happy hacking…