使用 Spring 和 Hibernate 进行表分区
简介
本文将带你了解如何使用 Spring 和 Hibernate 实现表分区。
表分区的目标是将一个大型表分割为多个较小的分区表,以便关联的表和索引记录可以放入内存缓冲池,从而实现更高效的查找或扫描操作。
使用 PostgreSQL 进行表分区
PostgreSQL 为 表分区 提供了三种策略:
- 列表分区(List Partitioning)
- 范围分区(Range Partitioning)
- Hash 分区(Hash Partitioning)
本例使用列表分区,按大洲来划分数据表。
例如,users
分区如下:
CREATE TABLE users (
id bigint NOT NULL,
first_name varchar(255),
last_name varchar(255),
registered_on timestamp(6),
partition_key varchar(255),
PRIMARY KEY (id, partition_key)
)
PARTITION BY LIST (partition_key)
CREATE TABLE users_asia
PARTITION OF users FOR VALUES IN ('Asia')
CREATE TABLE users_africa
PARTITION OF users FOR VALUES IN ('Africa')
CREATE TABLE users_north_america
PARTITION OF users FOR VALUES IN ('North America')
CREATE TABLE users_south_america
PARTITION OF users FOR VALUES IN ('South America')
CREATE TABLE users_europe
PARTITION OF users FOR VALUES IN ('Europe')
CREATE TABLE users_australia
PARTITION OF users FOR VALUES IN ('Australia')
posts
表分区如下:
CREATE TABLE posts (
id bigint NOT NULL,
title varchar(255),
created_on timestamp(6),
user_id bigint,
partition_key varchar(255),
PRIMARY KEY (id, partition_key)
)
PARTITION BY LIST (partition_key)
CREATE TABLE posts_asia
PARTITION OF posts FOR VALUES IN ('Asia')
CREATE TABLE posts_africa
PARTITION OF posts FOR VALUES IN ('Africa')
CREATE TABLE posts_north_america
PARTITION OF posts FOR VALUES IN ('North America')
CREATE TABLE posts_south_america
PARTITION OF posts FOR VALUES IN ('South America')
CREATE TABLE posts_europe
PARTITION OF posts FOR VALUES IN ('Europe')
CREATE TABLE posts_australia
PARTITION OF posts FOR VALUES IN ('Australia')
ALTER TABLE IF EXISTS posts
ADD CONSTRAINT fk_posts_user_id
FOREIGN KEY (user_id, partition_key)
REFERENCES users
用 JPA 和 Hibernate 映射表分区
使用 Spring 和 Hibernate 实现表分区需要在读写分区表的记录时提供分区 Key。
首先,创建一个 PartitionAware
基类,每个映射到分区表的 JPA 实体都将继承该基类:
@MappedSuperclass
@FilterDef(
name = PartitionAware.PARTITION_KEY,
parameters = @ParamDef(
name = PartitionAware.PARTITION_KEY,
type = String.class
)
)
@Filter(
name = PartitionAware.PARTITION_KEY,
condition = "partition_key = :partitionKey"
)
public abstract class PartitionAware<T extends PartitionAware> {
public static final String PARTITION_KEY = "partitionKey";
@Column(name = "partition_key")
@PartitionKey
private String partitionKey;
public String getPartitionKey() {
return partitionKey;
}
public T setPartitionKey(String partitionKey) {
this.partitionKey = partitionKey;
return (T) this;
}
public T setPartition(Partition partition) {
this.partitionKey = partition.getKey();
return (T) this;
}
}
@Filter
注解允许动态启用分区过滤功能@PartitionKey
注解是在 Hibernate 6.2 中引入的,它允定义一个分区 key。
Partition
对象是一个枚举,用于定义应用支持的分区:
public enum Partition {
ASIA("Asia"),
AFRICA("Africa"),
NORTH_AMERICA("North America"),
SOUTH_AMERICA("South America"),
EUROPE("Europe"),
AUSTRALIA("Australia"),
;
private final String key;
Partition(String key) {
this.key = key;
}
public String getKey() {
return key;
}
}
@MappedSuperclass
注解用于表示实体类的父类。它可以用来定义多个实体类之间的公共映射和字段。它本身并不被视为一个实体,不能直接持久化到数据库中。
User
实体继承了 PartitionAware
类,如下:
@Entity
@Table(name = "users")
public class User extends PartitionAware<User> {
@Id
@GeneratedValue
private Long id;
@Column(name = "first_name")
private String firstName;
@Column(name = "last_name")
private String lastName;
@Column(name = "registered_on")
@CreationTimestamp
private LocalDateTime createdOn = LocalDateTime.now();
// get / set 方法省略
}
Post
实体同样,如下:
@Entity
@Table(name = "posts")
public class Post extends PartitionAware<Post> {
@Id
@GeneratedValue
private Long id;
private String title;
@Column(name = "created_on")
@CreationTimestamp
private LocalDateTime createdOn;
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "user_id")
private User user;
// get / set 方法省略
}
现在,通过 Spring EntityManager
初始化机制在创建 JPA EntityManager
时默认启用分区过滤:
@Bean
public JpaTransactionManager transactionManager(
EntityManagerFactory entityManagerFactory){
JpaTransactionManager transactionManager = new JpaTransactionManager();
transactionManager.setEntityManagerFactory(entityManagerFactory);
transactionManager.setEntityManagerInitializer(entityManager -> {
User user = UserContext.getCurrent();
if (user != null) {
entityManager.unwrap(Session.class)
.enableFilter(PartitionAware.PARTITION_KEY)
.setParameter(
PartitionAware.PARTITION_KEY,
user.getPartitionKey()
);
}
});
return transactionManager;
}
当前分区是根据当前登录的 User
设置的。由于启用了分区 Filter
,每次实体查询都会从当前登录 User
的分区中选择 User
和 Post
实体。
创建以下 Hibernate PersistEventListener
,在当前登录的 User
持久化的每个实体上设置 partition_key
列
public class PartitionAwareInsertEventListener
implements PersistEventListener {
public static final PartitionAwareInsertEventListener INSTANCE =
new PartitionAwareInsertEventListener();
@Override
public void onPersist(PersistEvent event)
throws HibernateException {
final Object entity = event.getObject();
if (entity instanceof PartitionAware partitionAware) {
if (partitionAware.getPartitionKey() == null) {
FilterImpl partitionKeyFilter = (FilterImpl) event
.getSession()
.getEnabledFilter(PartitionAware.PARTITION_KEY);
partitionAware.setPartitionKey(
(String) partitionKeyFilter
.getParameter(PartitionAware.PARTITION_KEY)
);
}
}
}
@Override
public void onPersist(PersistEvent event, PersistContext persistContext)
throws HibernateException {
onPersist(event);
}
}
使用以下 PartitionAwareEventListenerIntegrator
注册 PartitionAwareInsertEventListener
:
public class PartitionAwareEventListenerIntegrator
implements Integrator {
public static final PartitionAwareEventListenerIntegrator INSTANCE =
new PartitionAwareEventListenerIntegrator();
@Override
public void integrate(
Metadata metadata,
BootstrapContext bootstrapContext,
SessionFactoryImplementor sessionFactory) {
sessionFactory
.getServiceRegistry()
.getService(EventListenerRegistry.class)
.prependListeners(
EventType.PERSIST,
PartitionAwareInsertEventListener.INSTANCE
);
}
@Override
public void disintegrate(
SessionFactoryImplementor sessionFactory,
SessionFactoryServiceRegistry serviceRegistry) {
}
}
PartitionAwareEventListenerIntegrator
是通过基于 Java 的属性配置提供给 Spring 的:
properties.put(
EntityManagerFactoryBuilderImpl.INTEGRATOR_PROVIDER,
(IntegratorProvider) () -> List.of(
PartitionAwareEventListenerIntegrator.INSTANCE
)
);
就这样!
测试
持久化一个 User
和 3 个 Post
实体:
final User vlad = new User()
.setFirstName("Vlad")
.setLastName("Mihalcea")
.setPartition(Partition.EUROPE);
userRepository.persist(vlad);
UserContext.logIn(vlad);
forumService.createPosts(LongStream.rangeClosed(1, POST_COUNT)
.mapToObj(postId -> new Post()
.setTitle(
String.format(
"High-Performance Java Persistence - Part %d",
postId
)
)
.setUser(vlad)
)
.toList()
);
Hibernate 将生成以下 SQL INSERT
语句:
INSERT INTO users (
registered_on,
first_name,
last_name,
partition_key,
id
)
VALUES (
'2023-11-09 11:22:55.802704',
'Vlad, Mihalcea',
'Europe',
1
)
INSERT INTO posts (
created_on,
partition_key,
title,
user_id,
id
)
VALUES (
'2023-11-09 11:19:55.856126',
'Europe',
'High-Performance Java Persistence - Part 1',
1,
1
), (
'2023-11-09 11:19:55.856126',
'Europe',
'High-Performance Java Persistence - Part 2',
1,
2
),(
'2023-11-09 11:19:55.856126',
'Europe',
'High-Performance Java Persistence - Part 3',
1,
3
)
检索 Post
实体:
List<Post> posts = forumService.findByIds(
LongStream.rangeClosed(1, POST_COUNT).boxed().toList()
);
Hibernate 将根据当前登录 User
的分区进行过滤:
SELECT
p.id,
p.created_on,
p.partition_key,
p.title,
p.user_id
FROM
posts p
WHERE
p.partition_key = 'Europe' AND
p.id in (1, 2, 3)
在对该 SQL 查询运行 EXPLAIN ANALYZE
时,可以看到只有 posts_europe
表被扫描:
Seq Scan on posts_europe p
(cost=0.00..11.14 rows=1 width=1056)
(actual time=0.022..0.023 rows=3 loops=1)
Filter: (
((partition_key)::text = 'Europe'::text) AND
(id = ANY ('{1,2,3}'::bigint[]))
)
参考:https://vladmihalcea.com/table-partitioning-spring-hibernate/